way derived products in the plastids of transgenic plants and microalgae, (2) create
novel antibiotic resistant transgenic plants and microalgae, and (3) create a novel
selection system and/or targeting sites for mediating the insertion of genetic material
into plant and microalgae plastids. The specific polynucleotides to be used, solely or in
any combination thereof, are publicly available from GenBank and contain open reading
frames having sequences that upon expression will produce active proteins with the
following enzyme activities: (a) acetoacetyl CoA thiolase (EC 2.3.1.9),
(b) 3-hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA) synthase (EC 4.1.3.5),
(c) HMG-CoA reductase (EC 1.1.1.34), (d) mevalonate kinase (EC 2.7.1.36),
(e) phosphomevalonate kinase (EC 2.7.4.2), (f) mevalonate diphosphate decarboxylase
(EC 4.1.1.33), (g) isopentenyl diphosphate (IPP) isomerase (EC 5.3.3.2), and
(h) phytoene synthase (EC 2.5.1.32).

DESCRIPTION

MANIPULATION OF GENES OF THE MEVALONATE AND ISOPRENOID 
PATHWAYS TO CREATE NOVEL TRAITS IN TRANSGENIC ORGANISMS 

Cross-Reference to a Related Application 
This application claims the benefit of U.S. Provisional Application No. 
60/221,703, filed July 31, 2000. 

Field of the Invention

This invention relates to the fields of biotechnology and genetic engineering, in 
particular to agricultural and aquacultural biotechnology. More specifically, the invention 
relates to transgenic plants and microalgae, in particular to transplastomic plants and 
microalgae and means for insertion of genetic material into plastids. 

Background of the Invention 
The ubiquitous isoprenoid biosynthetic pathway is responsible for the formation
of the most chemically diverse family of metabolites found in nature (Hahn et al., J.
Bacteriol. 178:619-624, 1996) including sterols (Popjak, Biochemical symposium no. 29
(T. W. Goodwin, ed.) Academic Press, New York, pp 17-37, 1970), carotenoids
(Goodwin, Biochem. J. 123:293-329, 1971), dolichols (Matsuoka et al., J. Biol. Chem.
266:3464-3468, 1991), ubiquinones (Ashby and Edwards, J. Biol. Chem.
265:13157-13164, 1990), and prenylated proteins (Clarke, Annu. Rev. Biochem.
61:355-386, 1992). Biosynthesis of isopentenyl diphosphate (IPP), the essential 5-carbon
isoprenoid precursor, occurs by two distinct compartmentalized routes in plants (Lange
and Croteau, Proc. Natl. Acad. Sci. USA 96:13714-13719, 1999). In the plant cytoplasm,
IPP is assembled from three molecules of acetyl coenzyme A by the well-characterized
mevalonate pathway (Lange and Croteau, Proc. Natl. Acad. Sci. USA 96:13714-13719,
1999). However, a recently discovered mevalonate-independent pathway is responsible
for the synthesis of IPP in plant chloroplasts (Lichtenthaler et al., FEBS Letters
400:271-274, 1997).

Following the synthesis of IPP via the mevalonate route, the carbon-carbon double
bond must be isomerized to create the potent electrophile dimethylallyl diphosphate
(DMAPP). This essential activation step, carried out by IPP isomerase, ensures the
existence of the two 5-carbon isomers, IPP and DMAPP, which must join together in the
first of a series of head-to-tail condensation reactions to create the essential allylic
diphosphates of the isoprenoid pathway (Hahn and Poulter, J. Biol. Chem.
270:11298-11303, 1995). Recently, it was reported that IPP isomerase activity was not
essential in E. coli, one of many eubacteria containing only the non-mevalonate pathway
for the synthesis of both 5-carbon isomers, suggesting the existence of two separate
mevalonate-independent routes to IPP and DMAPP (Hahn et al., J. Bacteriol.
181:4499-4504, 1999). Thus, it is unclear whether an IPP isomerase is essential for the
synthesis of isoprenoids in plant plastids as well. Regardless of whether IPP isomerase
activity is present in plant plastids, the separation by compartmentalization of the two
different biosynthetic routes, the mevalonate and deoxyxylulose phosphate (or
"non-mevalonate") pathways, for IPP and DMAPP biosynthesis in plants is the
fundamental tenet upon which the subject inventions are based.

The synthesis of IPP by the mevalonate pathway (Eisenreich et al., Chemistry and
Biology 5:R221-R233, 1998) is cytoplasm based and occurs as follows: The
condensation of two acetyl CoA molecules to yield acetoacetyl CoA is catalyzed by
acetoacetyl CoA thiolase (EC 2.3.1.9). The addition of another molecule of acetyl CoA
to acetoacetyl CoA is catalyzed by 3-hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA)
synthase (EC 4.1.3.5) to yield HMG-CoA, which is reduced in the subsequent step to
mevalonate by HMG-CoA reductase (EC 1.1.1.34). Mevalonate is phosphorylated by
mevalonate kinase (EC 2.7.1.36) to yield phosphomevalonate, which is phosphorylated
by phosphomevalonate kinase (EC 2.7.4.2) to form mevalonate diphosphate. The
conversion of mevalonate diphosphate to IPP with the concomitant release of CO2 is
catalyzed by mevalonate diphosphate decarboxylase (EC 4.1.1.33).
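
The six cytoplasmic steps described above can be written as an ordered table, which makes it easy to confirm that the product of each step is the substrate of the next. The following is a minimal Python sketch for illustration only: the `Step` type and the `trace` function are hypothetical names introduced here, and only the enzyme names, EC numbers, and metabolites are taken from the paragraph above. The second acetyl CoA consumed by HMG-CoA synthase is omitted from the table for simplicity.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    """One enzymatic step; names are illustrative, values come from the text."""
    enzyme: str
    ec: str
    substrate: str
    product: str


# Ordered steps of the mevalonate pathway as listed in the description.
MEVALONATE_PATHWAY: List[Step] = [
    Step("acetoacetyl CoA thiolase", "2.3.1.9", "acetyl CoA", "acetoacetyl CoA"),
    Step("HMG-CoA synthase", "4.1.3.5", "acetoacetyl CoA", "HMG-CoA"),
    Step("HMG-CoA reductase", "1.1.1.34", "HMG-CoA", "mevalonate"),
    Step("mevalonate kinase", "2.7.1.36", "mevalonate", "phosphomevalonate"),
    Step("phosphomevalonate kinase", "2.7.4.2",
         "phosphomevalonate", "mevalonate diphosphate"),
    Step("mevalonate diphosphate decarboxylase", "4.1.1.33",
         "mevalonate diphosphate", "IPP"),
]


def trace(pathway: List[Step], start: str) -> List[str]:
    """Follow the pathway from `start`, checking that each step connects."""
    metabolite = start
    route = [metabolite]
    for step in pathway:
        if step.substrate != metabolite:
            raise ValueError(
                f"{step.enzyme} expects {step.substrate}, got {metabolite}"
            )
        metabolite = step.product
        route.append(metabolite)
    return route


if __name__ == "__main__":
    print(" -> ".join(trace(MEVALONATE_PATHWAY, "acetyl CoA")))
```

A table of this kind is only a bookkeeping aid; it says nothing about kinetics, cofactors, or regulation.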

In organisms utilizing the deoxyxylulose phosphate pathway (aka "non-mevalonate
pathway", "methylerythritol phosphate (MEP) pathway", and "Rohmer pathway"), the
five carbon atoms in the basic isoprenoid unit are derived from pyruvate and
D-glyceraldehyde phosphate (GAP) (Eisenreich et al., 1998). Thus, synthesis of IPP
and/or DMAPP by the non-mevalonate route, which occurs in plastids, is as follows:
Pyruvate and GAP are condensed to give 1-deoxy-D-xylulose 5-phosphate (DXP) by
DXP synthase (Sprenger et al., Proc. Natl. Acad. Sci. USA 94:12857-12862, 1997). The
rearrangement and reduction of DXP to form 2-C-methylerythritol 4-phosphate (MEP),
the first committed intermediate in the non-mevalonate pathway for biosynthesis of
isoprenoids, is catalyzed by DXP reductoisomerase (Kuzuyama et al., Tetrahedron Lett.
39:4509-4512, 1998). MEP is then appended to CTP to form 4-(cytidine
5'-diphospho)-2-C-methyl-D-erythritol (Rohdich et al., Proc. Natl. Acad. Sci. USA
96:11758-11763, 1999), followed by phosphorylation of the C2 hydroxyl group (Lüttgen
et al., Proc. Natl. Acad. Sci. USA 97:1062-1067, 2000) and elimination of CMP, to form
a 2,4-cyclic diphosphate (Herz et al., Proc. Natl. Acad. Sci. USA 97:2486-2490, 2000).
Interestingly, Herz et al. reported the possible existence of bifunctional proteins with both
YgbP and YgbB activities. Once the remaining steps to the fundamental five-carbon
isoprenoid building blocks, IPP and DMAPP, in the non-mevalonate pathway are
discovered, they will serve as additional targets for inhibitors with antibiotic and
herbicidal activity.

Since the non-mevalonate pathway is ultimately responsible for the biosynthesis
of compounds critical for photosynthesis such as the prenyl side-chain of chlorophylls,
which serve as lipophilic anchors for the photoreceptors and the photoprotective
carotenoid pigments, any enzyme, gene, or regulatory sequence involved in the
biosynthesis of IPP and/or DMAPP can be a potential target for herbicides. For example,
the antibiotic fosmidomycin, a specific inhibitor of the enzyme DXP reductoisomerase
(Kuzuyama et al., Tetrahedron Lett. 39:7913-7916, 1998) has been shown to have
significant herbicidal activity, especially in combination with other herbicides (Kamuro
et al., "Herbicide" U.S. Patent No. 4,846,872; issued July 11, 1989). The report of an
Arabidopsis thaliana albino mutant being characterized as a disruption of the CLA1 gene,
later revealed as encoding DXP synthase by Rohmer et al. (Lois et al., Proc. Natl. Acad.
Sci. USA 95:2105-2110, 1998), also illustrates the potential of non-mevalonate pathway
enzymes as targets for compounds with herbicidal activity. Accordingly, one of ordinary
skill in the art can readily understand that as additional compounds are discovered
exhibiting herbicidal activity based on their effects on the non-mevalonate pathway, those
compounds could be used in accord with the teachings herein.

The synthesis of carotenoids from IPP and DMAPP takes place in plant plastids by
a genetically- and enzymatically-defined pathway (Cunningham and Gantt, Ann. Rev.
Plant Mol. Biol. 39:475-502, 1998). Enhanced production of carotenoids such as
lycopene and β-carotene in plants is highly desirable due to the reported health benefits
of their consumption (Kajiwara et al., Biochem. J. 324:421-426, 1997). Enhanced
carotenoid production in plants can also have a dramatic effect on their coloration and be
highly desirable to the growers of ornamentals, for example. The IPP isomerase reaction
is considered to be a rate-limiting step for isoprenoid biosynthesis (Ramos-Valdivia et
al., Nat. Prod. Rep. 6:591-603, 1997). Kajiwara et al. reported that the expression of
heterologous IPP isomerase genes in a strain of E. coli specifically engineered to produce
carotenoids resulted in over a 2-fold increase in β-carotene formation. Recently, it has
been reported that expression of an additional gene for DXP synthase in an E. coli strain
specifically engineered to produce carotenoids also increased the level of lycopene
substantially (Harker and Bramley, FEBS Letters 448:115-119, 1999). Increased
isoprenoid production also has been shown in bacteria by combining carotenogenic genes
from bacteria with an orf encoding IPP isomerase; and was even further enhanced when
additionally combined with the dxs gene from the MEP pathway to supply the precursors
IPP and DMAPP (Albrecht et al., Nature Biotechnology 18:843-846, 2000).

Accumulation of one specific isoprenoid, such as beta-carotene (yellow-orange)
or astaxanthin (red-orange), can serve to enhance flower color or nutraceutical
composition, depending on whether the host is cultivated as an ornamental or as an
output crop, and on whether the product accumulates in the tissue of interest (i.e.,
flower parts or harvestable tissue). In plants, tissue with intrinsic carotenoid enzymes
can accumulate ketocarotenoids such as astaxanthin in chromoplasts of reproductive
tissues of tobacco by addition of the biosynthetic enzyme beta-carotene ketolase (Mann
et al., Nature Biotechnology 18:888-892, 2000). Astaxanthin is the main carotenoid
pigment found in aquatic animals; in microalgae it accumulates in the Chlorophyta such
as in species of Haematococcus and Chlamydomonas. Thus, an increase in the essential
5-carbon precursors, IPP and DMAPP, by expression of orfs encoding IPP isomerase and
orfs upstream thereof, can feed into the production output of such valuable isoprenoids
in organisms other than bacteria.

As a further example of utility, Petunia flower color is usually due to the presence
of modified cyanidin and delphinidin anthocyanin pigments to produce shades in red to
blue groupings. Recently produced yellow seed-propagated multiflora and grandiflora
petunias obtain their coloration from the presence of beta-carotene, lutein and zeaxanthin
carotenoid pigments in combination with colorless flavonols (Nielsen and Bloor, Scientia
Hort. 71:257-266, 1997). Industry still lacks bright yellow and orange clonally
propagated trailing petunias. Metabolic engineering of the carotenoid pathway is desired
to introduce these colors in this popular potted and bedding plant.

Plant genetic engineering has evolved since the 1980s from arbitrarily located
monocistronic insertions into a nuclear chromosome, often subject to multiple copies,
rearrangements and methylation, to predetermined sites for defined multicistronic or
multigenic operon insertions into a plastid chromosome (plastome), which thus far is
thought impervious to typical nuclear gene inactivation. While breeding of crop plants
by nuclear genome engineering is nevertheless a proven technology for major agronomic
crops and for traits such as herbicide resistance, introgression of genes into the plastome
is a highly promising breeding approach for several reasons as described by Bock and
Hagemann (Bock and Hagemann, Prog. Bot. 61:76-90, 2000). Of note is the containment
of transgenes in the transplastomic plant: Plastids are inherited through the maternal
parent in most plant species and thus plastid-encoded transgenes are unable to spread in
pollen to non-target species. Therefore plastid engineering can minimize negative
impacts of genetically engineered plants. A report on potential transfer by pollen of
herbicide resistance into weedy relatives of cultivated crops (Keeler et al., Herbicide
Resistant Crops: Agricultural, Economic, Environmental, Regulatory and Technological
Aspects, pp. 303-330, 1996) underscores the value of using plastid engineering rather
than nuclear engineering for critical production traits such as herbicide resistance.
Daniell et al. have recently demonstrated herbicide resistance through genetic engineering
of the chloroplast genome (Daniell et al., Nat. Biotechnol. 16:345-348, 1998).

Moreover, plastids are the site of essential biosynthetic activity. Although
photosynthesis is generally regarded as the primary function of the chloroplast, studies
document that the chloroplast is the center of activity for functions involving carbon
metabolism, nitrogen metabolism, sulfur metabolism, biochemical regulation, and various
essential biosynthetic pathways including amino acid, vitamin, and phytohormone
biosynthesis. Crop traits of interest such as nutritional enhancement require genetic
manipulations that impact plastid biosynthetic pathways such as carotenoid production.
While nuclear-encoded gene products can be exported from the engineered nucleus into the
plastid for such manipulations, the biosynthetic genes themselves can be inserted into the
plastid for expression and activity. As we begin to pyramid multiple genes often required
for pathway manipulations (such as the aforementioned carotenoid biosynthesis) the
repeated use of selection markers is expected to lead to unstable crops through
homology-dependent gene silencing (Meyer and Saedler, Ann. Rev. Plant. Physiol. Mol.
Biol. 47:23-48, 1996). In addition, nuclear transformations often fall short of the higher
transgene expression levels required for effective phenotypes such as vitamin levels and
herbicide and pest resistance levels. These deficiencies are overcome through plastid
transformation or by combining plastid with nuclear transformations: The plastid
recognizes strings of genes linked together in multicistronic operons and, due to the high
copy number of genes within a plastid and of plastids within a cell, can produce a
hundred- to a thousand-fold greater amount of transgene product. Accordingly, there is a
continuing need for improved methods of producing plants having transformed plastids
(transplastomic plants).

Golden rice is one example in which plastid engineering can complement nuclear
engineering of pathways that reside in the plastid but have met with limited success. The
metabolic pathway for beta-carotene (pro-vitamin A) was assembled in rice plastids by
introduction into the nuclear genome of four separate genes, three encoding
plastid-targeted proteins using three distinct promoters, plus a fourth selectable marker
gene using a repeated promoter (Ye et al., Science 287:303-305, 2000). The wild-type
rice endosperm is free of carotenoids but it does produce geranylgeranyl diphosphate;
combining phytoene synthase, phytoene desaturase, and lycopene-beta cyclase resulted
in accumulation of beta-carotene to make "golden rice". However, the quantity produced
was lower than the minimum desired for addressing vitamin A deficiency. An increased
supply of precursors for increasing intermediates, such as geranylgeranyl diphosphate,
is predicted to significantly increase isoprenoid production. Insertion of an operon
encoding the entire mevalonate pathway into the rice plastome of the "golden rice"
genotype, using for example the methods as described in Khan and Maliga, Nature
Biotechnology 17:910-914, 1999, can provide a means for making improvements in
metabolic engineering of this important monocot crop.

Proplastid and chloroplast genetic engineering have been demonstrated, with varying
degrees of homoplasmy, for several major agronomic crops including potato, rice, maize,
soybean, grape, sweet potato, and tobacco, including transformation starting from
non-green tissues. Non-lethal selection on antibiotics is used to proliferate cells
containing plastids with antibiotic resistance genes. Plastid transformation methods use
two plastid-DNA flanking sequences that recombine with plastid sequences to insert
chimeric DNA into the spacer regions between functional genes of the plastome, as is
established in the field (see Bock and Hagemann, Prog. Bot. 61:76-90, 2000, and Guda et
al., Plant Cell Reports 19:257-262, 2000, and references therein).

Antibiotics such as spectinomycin, streptomycin, and kanamycin can shut down
gene expression in chloroplasts by ribosome inactivation. These antibiotics bleach leaves
and form white callus when tissue is put onto regeneration medium in their presence.
The bacterial genes aadA and neo encode the enzymes
aminoglycoside-3'-adenyltransferase and neomycin phosphotransferase, which inactivate
these antibiotics, and can be used for positive selection of plastids engineered to express
these genes. Polynucleotides of interest can be linked to the selectable genes and thus can
be enriched by selection during the sorting out of engineered and non-engineered plastids.
Consequently, cells with plastids engineered to contain genes for these enzymes (and
linkages thereto) can overcome the effects of inhibitors in the plant cell culture medium
and can proliferate, while cells lacking engineered plastids cannot proliferate. Similarly,
plastids engineered with polynucleotides encoding enzymes from the mevalonate pathway
to produce IPP from acetyl CoA in the presence of inhibitors of the non-mevalonate
pathway can overcome otherwise inhibitory culture conditions. By utilizing the
polynucleotides disclosed herein in accord with this invention, an inhibitor targeting the
non-mevalonate pathway and its components can be used for selection purposes of
transplastomic plants produced through currently available methods, or any future
methods which become known for production of transplastomic plants, to contain and
express said polynucleotides and any linked coding sequences of interest.

This selection process of the subject invention is unique in that it is the first
selectable trait that acts by pathway complementation to overcome inhibitors. This is
distinguished from the state of the art of selection by other antibiotics to which resistance
is conferred by inactivation of the antibiotic itself, e.g. compound inactivation as for the
aminoglycoside 3'-adenyltransferase gene or neo gene. This method avoids the occurrence
of resistant escapes due to random insertion of the resistance gene into the nuclear
genome or by spontaneous mutation of the ribosomal target of the antibiotic, as is known
to occur in the state of the art. Moreover, this method requires the presence of an entire
functioning mevalonate pathway in plastids. For example, if one of the enzyme activities
of the mevalonate pathway is not present in the plastid, resistance will not be conferred.
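
The all-or-nothing requirement stated above (resistance is conferred only if every mevalonate pathway activity is present in the plastid) can be expressed as a simple set check. This is a minimal Python sketch under stated assumptions: the function names `missing_enzymes` and `confers_resistance` are hypothetical, the enzyme list is the six-step pathway from acetyl CoA to IPP named in this description, and the model deliberately ignores expression level and enzyme kinetics.

```python
from typing import FrozenSet, Iterable

# The six activities needed to convert acetyl CoA to IPP, as named in the text.
MEVALONATE_ENZYMES: FrozenSet[str] = frozenset({
    "acetoacetyl CoA thiolase",
    "HMG-CoA synthase",
    "HMG-CoA reductase",
    "mevalonate kinase",
    "phosphomevalonate kinase",
    "mevalonate diphosphate decarboxylase",
})


def missing_enzymes(expressed: Iterable[str]) -> FrozenSet[str]:
    """Return the pathway activities absent from the plastid."""
    return MEVALONATE_ENZYMES - frozenset(expressed)


def confers_resistance(expressed: Iterable[str]) -> bool:
    """Illustrative rule: resistance requires the complete pathway."""
    return not missing_enzymes(expressed)


if __name__ == "__main__":
    partial = MEVALONATE_ENZYMES - {"mevalonate kinase"}
    print(confers_resistance(MEVALONATE_ENZYMES))  # complete pathway
    print(sorted(missing_enzymes(partial)))        # what is lacking
```

Because a single missing activity breaks the check, a plastid that takes up only part of the construct would not be expected to survive selection under this rule.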

There is strong evidence indicating that the origin of plastids within the cell
occurred via endosymbiosis and that plastids are derived from cyanobacteria. As such,
the genetic organization of the plastid is prokaryotic in nature (as opposed to the
eukaryotic nuclear genome of the plant cell). The plastid chromosome ranges from
roughly 110 to 150 kb in size (196 for the green alga Chlamydomonas), much smaller
than that of most cyanobacteria. However, many of the bacterial genes have either been
lost because their function was no longer necessary for survival, or were transferred to
the chromosomes of the nuclear genome. Most, but not all, of the genes remaining on the
plastid chromosome function in either carbon metabolism or plastid genetics. However,
many genes involved in these functions, as well as the many other functions and
pathways intrinsic to plastid function, are also nuclear encoded, and the translated
products are transported from the cytoplasm to the plastid. Studies have documented
nuclear encoded genes with known activity in the plastid that are genetically more similar
to homologous genes in bacteria than to genes of the same organism with the same
function but activity in the cytoplasm, as reviewed for example in Martin et al. (1998)
Nature 393:162-165 and references therein.

The process whereby genes are transported from the plastid to the nucleus has
been addressed. Evidence indicates that copies of many plastid genes are found among
nuclear chromosomes. For some of these, promoter regions and transit peptides (small
stretches of DNA encoding peptides that direct polypeptides to the plastid) become
associated with the gene, allowing it to be transcribed and the translated polypeptide
to be relocated back into the plastid. Once this genetic apparatus has become established,
the genes present in the plastid chromosome may begin to degrade until they are no longer
functional, i.e., any such gene becomes a pseudogene.

As is common in prokaryotic systems, many genes that have a common function
are organized into an operon. An operon is a cluster of contiguous genes transcribed
from one promoter to give rise to a polycistronic mRNA. Proteins from each gene in the
polycistron are then translated. There are 18 operons in the plastid chromosome of
tobacco (Nicotiana tabacum). Although many of these involve as few as two genes, some
are large and include many genes. Evolutionary studies indicate that gene loss, whether
as pseudogenes or as completely missing sequences, occurs gene by gene rather than as
blocks of genes or transcriptional units. Thus other genes surrounding a pseudogene in a
polycistronic operon remain functional.

The rpl23 operon consists of genes whose products are involved in protein
translation. Most of these genes encode ribosomal proteins functioning in either the large
or small ribosomal subunit. One particular gene of note, infA, encodes an initiation factor
protein that is important in initiating protein translation. Although this gene is functional
in many plants, it is a pseudogene in tobacco and all other members of that family
(Solanaceae), including the horticulturally valuable tomato, petunia, and potato crops.
A recent survey of plant groups has indicated that there have been numerous losses of
functionality of infA (Millen et al., Plant Cell 13:645-658, 2001). This and other
pseudogenes have been identified in species whose chloroplast genomes have not yet been
fully sequenced.

Pseudogenes such as infA become potential target sequences for insertion of
intact orfs. Inserted orfs are controlled by regulatory upstream and downstream elements
of the polycistron and are promoterless themselves. Pseudogenes are known for a
multiplicity of crops and algae with chloroplast genomes that are already fully sequenced.
Crops include grains such as rice and trees such as Pinus. Of note in the latter are the
eleven ndh genes; all may serve as potential targets for transgene insertion.
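
The idea of using a pseudogene inside a polycistronic unit as the landing site for promoterless orfs can be pictured as an edit to an ordered gene list. The Python sketch below is an illustration only and rests on several assumptions: the `Gene` type, the `insert_at_pseudogene` function, and the gene names `upstream_gene`, `downstream_gene`, `orf1`, and `orf2` are all hypothetical placeholders, and the insertion is simplified to replacing the pseudogene entry outright. It does not model flanking sequences or recombination.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Gene:
    """A gene within a polycistronic unit (illustrative model)."""
    name: str
    functional: bool = True


def insert_at_pseudogene(operon: List[Gene], site: str,
                         orfs: List[str]) -> List[Gene]:
    """Return a new gene order with pseudogene `site` replaced by `orfs`.

    The inserted orfs carry no promoter of their own; in this model they
    simply inherit their position within the existing transcription unit.
    """
    for index, gene in enumerate(operon):
        if gene.name == site:
            if gene.functional:
                raise ValueError(f"{site} is functional; not a pseudogene site")
            new_genes = [Gene(name) for name in orfs]
            return operon[:index] + new_genes + operon[index + 1:]
    raise ValueError(f"{site} not found in operon")


if __name__ == "__main__":
    # Placeholder neighbours; not a verified map of any real operon.
    operon = [
        Gene("upstream_gene"),
        Gene("infA", functional=False),
        Gene("downstream_gene"),
    ]
    engineered = insert_at_pseudogene(operon, "infA", ["orf1", "orf2"])
    print([gene.name for gene in engineered])
```

Refusing to overwrite a functional gene mirrors the point made above that the genes surrounding a pseudogene remain functional and should stay that way.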

Transplastomic solanaceous crops are highly desirable in order to eliminate the
potential for gene transfer from engineered lines to wild species, as demonstrated in
Lycopersicon (Dale, PJ. 1992. Spread of engineered genes to wild relatives. Plant
Physiol. 100:13-15.). A method for plastid engineering that enables altered pigmentation,
for improved nutrition in tomato or improved flower color in Petunia and ornamental
tobacco as examples, is desirable for solanaceous crops. The infA gene is widely lost
among rosids and some asterids; among the latter, infA is a pseudogene in all solanaceous
species examined (representing 16 genera). The solanaceous infA DNA sequences show
high similarity, with all nucleotide changes within infA being documented. Thus one set
of flanking sequences of reasonable length as known in the art should serve for directed
insertion of an individual or multiple orfs into the infA sites of the solanaceous species.
It is documented in a solanaceous species that flanking sequences for genes to be inserted
into the plastome are not required to be specific for the target species, as incompletely
homologous plastid sequences are integrated at comparable frequencies (Kavanagh et al.,
Genetics 152:1111-1122, 1999).

The upstream 5' region, often referred to as the 5' UTR, is important for the
expression level of a transcript as it is translated. Knowing the translation products of
surrounding genes in a polycistron allows one to select a pseudogene site that is affiliated
with a strong 5' UTR for optimizing plastid expression in a particular tissue. The plastid
genome in many plant species can have multiple pseudogenes that are located in different
polycistronic sites. So, if one has a choice, one can select a site based first on whether it
is actively transcribed in green versus non-green plastids, and then on whether the
polycistron has high or low relative expression in that plastid type. Moreover,
monocistronic mRNA of ndhD was detected in developed leaves but not in greening or
expanding leaves of barley (Hordeum vulgare), despite this gene being part of a
polycistronic unit as reported by del Campo et al. (1997) Plant Physiol. 114:748. Thus,
one can time transgene product production by treating an inactive gene, based on
developmental expression, as a pseudogene for targeting and integration purposes using
the invention disclosed herein.
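
The two-stage choice described above (first filter candidate sites by the plastid type in which their polycistron is transcribed, then rank by relative expression) can be sketched as follows. This is a hypothetical Python illustration: the `Site` type, the `choose_site` function, the site names, and every expression value are invented placeholders, not measurements from any plastome.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Site:
    """A candidate pseudogene insertion site (illustrative model)."""
    pseudogene: str
    transcribed_in: FrozenSet[str]  # plastid types, e.g. "green", "non-green"
    relative_expression: float      # arbitrary units; placeholder values


def choose_site(sites: List[Site], plastid_type: str,
                prefer_high: bool = True) -> Optional[Site]:
    """Filter by plastid type, then pick the highest (or lowest) expresser."""
    candidates = [s for s in sites if plastid_type in s.transcribed_in]
    if not candidates:
        return None
    pick = max if prefer_high else min
    return pick(candidates, key=lambda s: s.relative_expression)


# Placeholder candidates; names and numbers are assumptions for illustration.
SITES = [
    Site("site_A", frozenset({"green"}), 5.0),
    Site("site_B", frozenset({"green", "non-green"}), 2.0),
    Site("site_C", frozenset({"non-green"}), 8.0),
]

if __name__ == "__main__":
    best = choose_site(SITES, "green")
    print(best.pseudogene if best else "no suitable site")
```

The `prefer_high` flag reflects that the text allows selecting for either high or low relative expression, depending on the desired level of transgene product.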

Algal species are becoming increasingly exploited as sources of nutraceuticals and
pharmaceuticals, and lend themselves to aquaculture. Mass production of the isoprenoid
compound astaxanthin produced by the green microalga Haematococcus is one successful
example of the above. Metabolic engineering that would increase product yields and
improve composition in microalgae would significantly benefit the industry. The
development of organellar transformation for the unicellular green alga Chlamydomonas
reinhardtii, with its single large chloroplast, opens the door for conducting studies on
genetic manipulation of the isoprenoid pathway. Filamentous or multicellular algae are
also of interest as untapped biofactories, as are other nongreen algae whose pathways for
producing unique fatty acids, amino acids, and pigments can be improved for commercial
benefit.

The biolistic DNA delivery method is a general means with which to transform
the chloroplast of algae (Boynton and Gillham, Methods Enzymol. 217:510-536, 1993).
Sequencing of at least six plastomes from algae should facilitate transformation systems
by confirming insertion sites, including pseudogene sites, and the regulatory elements
directing heterologous gene expression. What is required is a dominant marker for
selection of stable transformants to which natural resistance is absent (Stevens and
Purton, J. Phycol. 33:713-722, 1997). For Chlamydomonas, chloroplasts can be
engineered using markers that confer spectinomycin resistance following their integration
into the plastome via homologous recombination. By utilizing the polynucleotides
disclosed herein in accord with this invention, an inhibitor targeting the non-mevalonate
pathway and its components can be used for selection purposes of transplastomic algae
produced through currently available methods, or any future methods which become
known for production of transplastomic algae, to contain and express said
polynucleotides and any linked coding sequences of interest. This is a novel selection
vehicle for transplastomic algae. Moreover, elevating the supply of essential precursors
for isoprenoid production in algae as described above is enabled by this invention.

Summary of the Invention 
This invention relates to the presence of enzymatic activities necessary to form IPP
from acetyl CoA, generally known as the mevalonate pathway, within plant and
microalgae plastids. This invention may also require the presence of IPP isomerase
activity within plastids resulting from the insertion into said plants and microalgae of a
polynucleotide encoding a polypeptide with IPP isomerase activity. This invention may
be achieved by the use of any polynucleotide, be it a DNA molecule or molecules, or any
hybrid DNA/RNA molecule or molecules, containing at least one open reading frame that
when expressed provides a polypeptide(s) exhibiting said activities within plastids. These
open reading frames may be identical to their wild type progenitors, or alternatively may
be altered in any manner (for example, with plastid-optimized codon usage), may be
isolated from the host organism to be modified, may originate from another organism or
organisms, or may be any combination of origin so long as the encoded proteins are able
to provide the desired enzymatic activity within the target plastids. The described open
reading frames may be inserted directly into plastids using established methodology or
any methodology yet to be discovered. Alternatively, plastid localization of the desired
activities may be achieved by modifying genes already residing in the cell nucleus,
inserting foreign polynucleotides for nuclear residence, or inserting polynucleotides
contained on exogenous, autonomous plasmids into the cell cytoplasm so that in all cases
their encoded proteins are transported into the plastid. For example, a chloroplast transit
(targeting) peptide can be fused to a protein of interest. Any combination of the above
methods for realizing said activities in plant and microalgae plastids can be utilized. By
causing the complete mevalonate pathway enzymatic activity to occur in plastids
normally possessing only the non-mevalonate pathway, the presence of said activities
within the chloroplasts of a specific plant or microalgae will endow it with resistance to
a compound, molecule, etc. that targets a component of the non-mevalonate pathway, be
it an enzyme, gene, regulatory sequence, etc., thereby also providing a useful selection
system based on circumvention of the inhibition of the non-mevalonate pathway in
transplastomic plants and microalgae.

In addition, this invention relates to the use of open reading frames encoding
polypeptides with enzymatic activities able to convert acetyl CoA to IPP, generally
known as the mevalonate pathway, and a polypeptide with IPP isomerase activity as a
method for increasing the production of IPP, DMAPP, and isoprenoid pathway derived
products whose level within plant and microalgae plastids is dependent on the level of
IPP and/or DMAPP present within the plastids. The presence of exogenous genes
encoding 1-deoxy-D-xylulose-5-phosphate synthase and IPP isomerase has been shown
to increase the production of carotenoids in eubacteria, presumably due to an increased
production of IPP and/or DMAPP. Thus, insertion of the entire mevalonate pathway,
solely or coupled with an additional IPP isomerase, into plastids will increase the level
of IPP and/or DMAPP, resulting in an increased level of carotenoids and other yet to be
determined isoprenoid pathway derived products within plant and microalgae plastids.
This invention can utilize an open reading frame encoding the enzymatic activity for IPP
isomerase independently or in addition to said open reading frames comprising the entire
mevalonate pathway to obtain the increased level of isoprenoid pathway derived products
within plant and microalgae plastids. This invention may be achieved by the use of any
DNA molecule or molecules, or any hybrid DNA/RNA molecule or molecules,
containing open reading frames able to provide said activities within plant and microalgae
plastids. These open reading frames may be identical to their wild type progenitors, may
be altered in any manner, may be isolated from the plant to be modified, may originate
from another organism or organisms, or may be any combination of origin so long as the
encoded proteins are able to provide said activities within plastids. The described open
reading frames may be inserted directly into plant and microalgae plastids using
established methodology or any methodology yet to be discovered. Alternatively, plastid
localization of the desired activities may be achieved by modifying genes already residing
in the nucleus, inserting foreign genes for nuclear residence, or inserting genes contained
on exogenous, autonomous plasmids into the cytoplasm so that in all cases their encoded
proteins are transported into the plastid. Any combination of the above methods for
realizing said activities in plastids can be utilized.

Further, this invention also relates to the direct insertion of any foreign gene into
a plant or microalgae chloroplast by coupling it to the open reading frames encoding
polypeptides with enzymatic activities able to convert acetyl CoA to IPP, thus comprising
the entire mevalonate pathway. By utilizing a compound, molecule, etc. that targets a
component of the non-mevalonate pathway, be it an enzyme, gene, regulatory sequence,
etc., a method of selection analogous to the use of kanamycin and spectinomycin
resistance for the transformation event is achieved. As inhibition of the non-mevalonate
pathway in a plant or microalgae results in the impairment of photosynthesis, the
presence of the mevalonate pathway biosynthetic capability is apparent, thus enabling the
facile screening of concomitant incorporation into plastids of a foreign gene coupled to
the open reading frames comprising the entire mevalonate pathway. The use of a
polynucleotide comprising an open reading frame encoding a polypeptide with IPP
isomerase activity in addition to the open reading frames encoding the mevalonate
pathway is a particularly preferred embodiment, which provides all enzymatic activities
necessary to synthesize both IPP and DMAPP and overcome the effect(s) of inhibition
of the non-mevalonate pathway.

Further, this invention is unique and novel in that the transforming DNA, which is
integrated by two or more homologous/heterologous recombination events, is
purposefully targeted into inactive gene sites selected based on prior knowledge of
transcription in plastid type, developmental expression including post-transcriptional
editing, and post-transcriptional stability. Additionally, this invention uses the regulatory
elements of known inactive genes (pseudogenes) to drive production of a complete
transforming gene unrelated to the inserted gene site. Thus, by utilizing the transgene
insertion method disclosed herein in accord with this invention, any foreign gene can be
targeted to an inactive gene site (the pseudogene) through currently available methods of
gene transfer, or any future methods which become known for production of transgenic






WO 02/10398 



PCT/US01/24037 



and transplastomic plants, to contain and express said foreign gene and any linked coding
sequences of interest. This gene insertion process of the subject invention is unique in
that it is the first method specifically acting by pseudogene insertion to overcome the
need for promoters and other regulatory elements normally associated with a
transforming DNA vector while permitting site-specific recombination in organellar
genomes. The use of the infA pseudogene insertion site in the solanaceous crops in
particular is a preferred embodiment for the transformation of plastids using the open
reading frames for the mevalonate pathway as well as for providing the necessary
precursors for modified output traits in plants.


Brief Description of the Drawings
FIG. 1 is a map of cloning vector pFCO1 containing S. cerevisiae orfs encoding
phosphomevalonate kinase (PMK), mevalonate kinase (MVK), and mevalonate
diphosphate decarboxylase (MDD).

FIG. 2 is a map of expression vector pFCO2 containing S. cerevisiae orfs
encoding phosphomevalonate kinase (PMK), mevalonate kinase (MVK), and mevalonate
diphosphate decarboxylase (MDD).

FIG. 3 is a map of cloning vector pHKO1 containing S. cerevisiae orf encoding
acetoacetyl thiolase (AACT); A. thaliana orfs encoding HMG-CoA synthase (HMGS) and
HMG-CoA reductase (HMGRt).

FIG. 4 is a map of expression vector pHKO2 containing S. cerevisiae orfs
encoding phosphomevalonate kinase (PMK), mevalonate kinase (MVK), mevalonate
diphosphate decarboxylase (MDD), and acetoacetyl thiolase (AACT); A. thaliana orfs
encoding HMG-CoA synthase (HMGS) and HMG-CoA reductase (HMGRt), which in their
summation are designated Operon A, encoding the entire mevalonate pathway.

FIG. 5 is a map of cloning vector pHKO3 containing S. cerevisiae orfs encoding
phosphomevalonate kinase (PMK), mevalonate kinase (MVK), mevalonate diphosphate
decarboxylase (MDD), and acetoacetyl thiolase (AACT); A. thaliana orfs encoding
HMG-CoA synthase (HMGS) and HMG-CoA reductase (HMGRt), which in their summation
are designated Operon B, encoding the entire mevalonate pathway.

FIG. 6 is an illustration of how the mevalonate (MEV) pathway, by providing an
alternative biosynthetic route to IPP, circumvents blocks in the MEP pathway due to a
mutation in the gene for deoxyxylulose phosphate synthase (dxs) and due to inhibition by
fosmidomycin of deoxyxylulose phosphate reductoisomerase (dxr).

FIG. 7 is a map of vector pBSNT27 containing N. tabacum chloroplast DNA
(cpDNA) and the N. tabacum infA pseudogene and pBSNT27 sequence (SEQ ID NO: 17).

FIG. 8 is a map of plastid transformation vector pHKO4 containing N. tabacum
chloroplast DNA (cpDNA) flanking the insertion of Operon B into the infA pseudogene.

FIG. 9 is a map of cloning vector pHKO5 containing S. cerevisiae orfs encoding
phosphomevalonate kinase (PMK), mevalonate kinase (MVK), mevalonate
diphosphate decarboxylase (MDD), and acetoacetyl thiolase (AACT); A. thaliana orfs
encoding HMG-CoA synthase (HMGS) and HMG-CoA reductase (HMGRt); R. capsulatus
orf encoding IPP isomerase (IPPI), which in their summation are designated Operon C,
encoding the entire mevalonate pathway and IPP isomerase.

FIG. 10 is a map of cloning vector pFHO1 containing S. cerevisiae orf encoding
acetoacetyl thiolase (AACT); A. thaliana orf encoding HMG-CoA synthase (HMGS);
Streptomyces sp CL190 orf encoding HMG-CoA reductase (HMGR).

FIG. 11 is a map of cloning vector pFHO2 containing S. cerevisiae orfs encoding
phosphomevalonate kinase (PMK), mevalonate kinase (MVK), mevalonate
diphosphate decarboxylase (MDD), and acetoacetyl thiolase (AACT); A. thaliana orf
encoding HMG-CoA synthase (HMGS); Streptomyces sp CL190 orf encoding
HMG-CoA reductase (HMGR), which in their summation are designated Operon D,
encoding the entire mevalonate pathway.

FIG. 12 is a map of cloning vector pFHO3 containing S. cerevisiae orfs encoding
phosphomevalonate kinase (PMK), mevalonate kinase (MVK), mevalonate
diphosphate decarboxylase (MDD), and acetoacetyl thiolase (AACT); A. thaliana orf
encoding HMG-CoA synthase (HMGS); Streptomyces sp CL190 orf encoding
HMG-CoA reductase (HMGR); R. capsulatus orf encoding IPP isomerase (IPPI), which
in their summation are designated Operon E, encoding the entire mevalonate pathway and
IPP isomerase.

FIG. 13 is a map of cloning vector pFHO4 containing a S. cerevisiae orf encoding
acetoacetyl thiolase (AACT) coupled to the Streptomyces sp CL190 gene cluster, which
in their summation are designated Operon F, encoding the entire mevalonate pathway and
IPP isomerase.

FIG. 14 is a map of plastid transformation vector pHKO7 containing N. tabacum
chloroplast DNA (cpDNA) flanking the insertion of Operon C into the infA pseudogene.

FIG. 15 is a map of expression vector pHKO9 containing Operon B.

FIG. 16 is a map of expression vector pHK10 containing Operon C.

FIG. 17 is a map of plastid transformation vector pFHO6 containing N. tabacum
chloroplast DNA (cpDNA) flanking the insertion of both Operon E and the R. capsulatus
orf encoding phytoene synthase (PHS) into the infA pseudogene.

Brief Description of the Sequences
SEQ ID NO: 1 is a PCR primer containing Saccharomyces cerevisiae DNA.
SEQ ID NO: 2 is a PCR primer containing S. cerevisiae DNA.
SEQ ID NO: 3 is a PCR primer containing S. cerevisiae DNA.
SEQ ID NO: 4 is a PCR primer containing S. cerevisiae DNA.
SEQ ID NO: 5 is a PCR primer containing S. cerevisiae DNA.
SEQ ID NO: 6 is a PCR primer containing S. cerevisiae DNA.
SEQ ID NO: 7 is a PCR primer containing Arabidopsis thaliana DNA.
SEQ ID NO: 8 is a PCR primer containing A. thaliana DNA.
SEQ ID NO: 9 is a PCR primer containing A. thaliana DNA.
SEQ ID NO: 10 is a PCR primer containing A. thaliana DNA.
SEQ ID NO: 11 is a PCR primer containing S. cerevisiae DNA.
SEQ ID NO: 12 is a PCR primer containing S. cerevisiae DNA.
SEQ ID NO: 13 is an Oligonucleotide containing S. cerevisiae DNA.
SEQ ID NO: 14 is an Oligonucleotide containing A. thaliana and S. cerevisiae DNA.
SEQ ID NO: 15 is an Oligonucleotide containing S. cerevisiae DNA.
SEQ ID NO: 16 is an Oligonucleotide containing S. cerevisiae DNA.
SEQ ID NO: 17 is Vector pBSNT27 containing Nicotiana tabacum DNA.
SEQ ID NO: 18 is an Oligonucleotide containing N. tabacum and S. cerevisiae DNA.
SEQ ID NO: 19 is an Oligonucleotide containing N. tabacum and A. thaliana DNA.
SEQ ID NO: 20 is a PCR primer containing Rhodobacter capsulatus DNA.
SEQ ID NO: 21 is a PCR primer containing R. capsulatus DNA.
SEQ ID NO: 22 is a PCR primer containing Schizosaccharomyces pombe DNA.
SEQ ID NO: 23 is a PCR primer containing S. pombe DNA.
SEQ ID NO: 24 is a PCR primer containing Streptomyces sp CL190 DNA.
SEQ ID NO: 25 is a PCR primer containing Streptomyces sp CL190 DNA.
SEQ ID NO: 26 is an Oligonucleotide containing S. cerevisiae DNA.
SEQ ID NO: 27 is an Oligonucleotide containing S. cerevisiae DNA.
SEQ ID NO: 28 is an Oligonucleotide containing Streptomyces sp CL190 and R.
capsulatus DNA.
SEQ ID NO: 29 is an Oligonucleotide containing R. capsulatus DNA.
SEQ ID NO: 30 is an Oligonucleotide containing Streptomyces sp CL190 and S.
cerevisiae DNA.
SEQ ID NO: 31 is an Oligonucleotide containing Streptomyces sp CL190 DNA.
SEQ ID NO: 32 is an Oligonucleotide containing N. tabacum and S. cerevisiae DNA.
SEQ ID NO: 33 is an Oligonucleotide containing N. tabacum and R. capsulatus DNA.
SEQ ID NO: 34 is an Oligonucleotide containing N. tabacum and S. cerevisiae DNA.
SEQ ID NO: 35 is an Oligonucleotide containing N. tabacum and S. pombe DNA.
SEQ ID NO: 36 is an Oligonucleotide containing NotI restriction site.
SEQ ID NO: 37 is an Oligonucleotide containing NotI restriction site.
SEQ ID NO: 38 is an Oligonucleotide containing S. cerevisiae DNA.
SEQ ID NO: 39 is an Oligonucleotide containing A. thaliana DNA.
SEQ ID NO: 40 is an Oligonucleotide containing S. cerevisiae DNA.
SEQ ID NO: 41 is an Oligonucleotide containing R. capsulatus DNA.
SEQ ID NO: 42 is an Oligonucleotide containing S. cerevisiae DNA.
SEQ ID NO: 43 is an Oligonucleotide containing S. pombe DNA.
SEQ ID NO: 44 is an Oligonucleotide containing R. capsulatus DNA.
SEQ ID NO: 45 is an Oligonucleotide containing R. capsulatus DNA.
SEQ ID NO: 46 is an Oligonucleotide containing S. pombe DNA.






SEQ ID NO: 47 is an Oligonucleotide containing S. pombe DNA.
SEQ ID NO: 48 is Saccharomyces cerevisiae orf for phosphomevalonate kinase
(ERG8).
SEQ ID NO: 49 is Saccharomyces cerevisiae orf for mevalonate kinase (ERG12).
SEQ ID NO: 50 is Saccharomyces cerevisiae orf for mevalonate diphosphate
decarboxylase (ERG19).
SEQ ID NO: 51 is Saccharomyces cerevisiae orf for acetoacetyl thiolase.
SEQ ID NO: 52 is Arabidopsis thaliana orf for
3-hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA) synthase.
SEQ ID NO: 53 is Arabidopsis thaliana orf for HMG-CoA reductase.
SEQ ID NO: 54 is Schizosaccharomyces pombe IDI1 (IPP isomerase).
SEQ ID NO: 55 is Rhodobacter capsulatus idiB (IPP isomerase).
SEQ ID NO: 56 is Streptomyces sp CL190 orf encoding HMG-CoA reductase.
SEQ ID NO: 57 is Streptomyces sp CL190 gene cluster containing mevalonate
pathway and IPP isomerase orfs.
SEQ ID NO: 58 is Operon A containing A. thaliana and S. cerevisiae DNA.
SEQ ID NO: 59 is Operon B containing A. thaliana and S. cerevisiae DNA.
SEQ ID NO: 60 is Operon C containing A. thaliana, S. cerevisiae, and R.
capsulatus DNA.
SEQ ID NO: 61 is Operon D containing A. thaliana, S. cerevisiae, and
Streptomyces sp CL190 DNA.
SEQ ID NO: 62 is Operon E containing A. thaliana, S. cerevisiae, Streptomyces
sp CL190 DNA, and R. capsulatus DNA.
SEQ ID NO: 63 is Operon F containing S. cerevisiae and Streptomyces
sp CL190 DNA.
SEQ ID NO: 64 is Operon G containing A. thaliana, S. cerevisiae, and S. pombe DNA.
SEQ ID NO: 65 is a PCR primer containing R. capsulatus DNA.
SEQ ID NO: 66 is a PCR primer containing R. capsulatus DNA.
SEQ ID NO: 67 is an Oligonucleotide containing N. tabacum and R. capsulatus DNA.
SEQ ID NO: 68 is an Oligonucleotide containing N. tabacum and R. capsulatus DNA.
SEQ ID NO: 69 is an Oligonucleotide containing N. tabacum and S. cerevisiae DNA.
SEQ ID NO: 70 is an Oligonucleotide containing N. tabacum and R. capsulatus DNA.
SEQ ID NO: 71 is Rhodobacter capsulatus orf encoding phytoene synthase
(crtB).
SEQ ID NO: 72 is plastid transformation vector pHKO4, containing Operon B,
containing A. thaliana and S. cerevisiae DNA.
SEQ ID NO: 73 is plastid transformation vector pHKO7, containing Operon C,
containing A. thaliana, S. cerevisiae, and R. capsulatus DNA.
SEQ ID NO: 74 is plastid transformation vector pHKO8, containing Operon G,
containing A. thaliana, S. cerevisiae, and S. pombe DNA.
SEQ ID NO: 75 is plastid transformation vector pFHO5 containing R. capsulatus
DNA encoding phytoene synthase.
SEQ ID NO: 76 is plastid transformation vector pFHO6, containing Operon E,
containing A. thaliana, S. cerevisiae, Streptomyces sp CL190 DNA, and R. capsulatus
DNA.

Detailed Description

In the description that follows, a number of terms used in genetic engineering are 
utilized. In order to provide a clear and consistent understanding of the specification and 
claims, including the scope to be given such terms, the following definitions are 
provided. 

A protein is considered an isolated protein if it is a protein isolated from a host
cell in which it is naturally produced. It can be purified or it can simply be free of other
proteins and biological materials with which it is associated in nature, for example, if it
is recombinantly produced.

An isolated nucleic acid is a nucleic acid the structure of which is not identical
to that of any naturally occurring nucleic acid or to that of any fragment of a naturally
occurring genomic nucleic acid spanning more than three separate genes. The term
therefore covers, for example, (a) a DNA which has the sequence of part of a naturally
occurring genomic DNA molecule, but is not flanked by both of the coding or noncoding
sequences that flank that part of the molecule in the genome of the organism in which it
naturally occurs; (b) a nucleic acid incorporated into a vector or into the genomic or
plastomic DNA of a prokaryote or eukaryote in a manner such that the resulting molecule
is not identical to any naturally occurring vector or genomic or plastomic DNA; (c) a
separate molecule such as a cDNA, a genomic or plastomic fragment, a fragment
produced by polymerase chain reaction (PCR), or a restriction fragment; and (d) a
recombinant nucleotide sequence that is part of a hybrid gene, i.e., a gene encoding a
fusion protein. Specifically excluded from this definition are nucleic acids present in
mixtures of (i) DNA molecules, (ii) transfected cells, and (iii) cell clones, e.g., as these
occur in a DNA library such as a cDNA or genomic DNA library.

One DNA portion or sequence is downstream of a second DNA portion or sequence
when it is located 3' of the second sequence. One DNA portion or sequence is upstream
of a second DNA portion or sequence when it is located 5' of that sequence.

One DNA molecule or sequence and another are heterologous to one another if
the two are not derived from the same ultimate natural source, or are not naturally
contiguous to each other. The sequences may be natural sequences, or at least one
sequence can be derived from two different species or one sequence can be produced by
chemical synthesis provided that the nucleotide sequence of the synthesized portion was
not derived from the same organism as the other sequence.

A polynucleotide is said to encode a polypeptide if, in its native state or when 
manipulated by methods known to those skilled in the art, it can be transcribed and/or 
translated to produce the polypeptide or a fragment thereof. The anti-sense strand of such 
a polynucleotide is also said to encode the sequence. 

A nucleotide sequence is operably linked when it is placed into a functional
relationship with another nucleotide sequence. For instance, a promoter is operably
linked to a coding sequence if the promoter effects its transcription or expression.
Generally, operably linked means that the sequences being linked are contiguous and,
where necessary to join two protein coding regions, contiguous and in reading frame.
However, it is well known that certain genetic elements, such as enhancers, may be
operably linked even at a distance, i.e., even if not contiguous.

In a plastome, sequences are physically linked by virtue of the chromosome
configuration, but they are not necessarily operably linked due to differential expression,
for example. Transgenes can be physically linked prior to transformation, or can become
physically linked once they insert into a plastome. Transgenes can become operably
linked if they share regulatory sequences upon insertion into a plastome.

The term recombinant polynucleotide refers to a polynucleotide which is made
by the combination of two otherwise separated segments of sequence accomplished by
the artificial manipulation of isolated segments of polynucleotides by genetic engineering
techniques or by chemical synthesis. In so doing one may join together polynucleotide
segments of desired functions to generate a desired combination of functions.

The polynucleotides may also be produced by chemical synthesis, e.g., by the
phosphoramidite method described by Beaucage and Caruthers (1981) Tetra. Letts.,
22:1859-1862 or the triester method according to Matteucci et al. (1981) J. Am. Chem.
Soc., 103:3185, and may be performed on commercial automated oligonucleotide
synthesizers. A double-stranded fragment may be obtained from the single stranded
product of chemical synthesis either by synthesizing the complementary strand and
annealing the strands together under appropriate conditions or by adding the
complementary strand using DNA polymerase with an appropriate primer sequence.

Polynucleotide constructs prepared for introduction into a prokaryotic or
eukaryotic host will typically, but not always, comprise a replication system (i.e. vector)
recognized by the host, including the intended polynucleotide fragment encoding the
desired polypeptide, and will preferably, but not necessarily, also include transcription
and translational initiation regulatory sequences operably linked to the polypeptide-
encoding segment. Expression systems (expression vectors) may include, for example,
an origin of replication or autonomously replicating sequence (ARS) and expression
control sequences, a promoter, an enhancer and necessary processing information sites,
such as ribosome-binding sites, RNA splice sites, polyadenylation sites, transcriptional
terminator sequences, and mRNA stabilizing sequences. Signal peptides may also be
included where appropriate, preferably from secreted polypeptides of the same or related
species, which allow the protein to cross and/or lodge in cell membranes or be secreted
from the cell.

Variants or sequences having substantial identity or homology with the
polynucleotides encoding enzymes of the mevalonate pathway may be utilized in the
practice of the invention. Such sequences can be referred to as variants or modified
sequences. That is, a polynucleotide sequence may be modified yet still retain the ability
to encode a polypeptide exhibiting the desired activity. Such variants or modified
sequences are thus equivalents. Generally, the variant or modified sequence will
comprise at least about 40%-60%, preferably about 60%-80%, more preferably about
80%-90%, and even more preferably about 90%-95% sequence identity with the native
sequence.

Sequence relationships between two or more nucleic acids or polynucleotides are
generally defined as sequence identity, percentage of sequence identity, and substantial
identity. See, for example, "Pedestrian Guide to Analyzing Sequence Data Bases" at
www.embl-heidelberg.de/-schneide/paper/springer96/springer.html . In determining
sequence identity, a "reference sequence" is used as a basis for sequence comparison. The
reference may be a subset or the entirety of a specified sequence. That is, the reference
sequence may be a full-length gene sequence or a segment of the gene sequence.

Methods for alignment of sequences for comparison are well known in the art.
See, for example, Smith et al. (1981) Adv. Appl. Math. 2:482; Needleman et al. (1970)
J. Mol. Biol. 48:443; Pearson et al. (1988) Proc. Natl. Acad. Sci. 85:2444; CLUSTAL in
the PC/Gene Program by Intelligenetics, Mountain View, California; GAP, BESTFIT,
BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics
Computer Group (GCG), 575 Science Drive, Madison, Wisconsin, USA. Preferred
computer alignment methods also include the BLASTP, BLASTN, and BLASTX
algorithms. See, Altschul et al. (1990) J. Mol. Biol. 215:403-410.

"Sequence identity" or "identity" in the context of nucleic acid or polypeptide
sequences refers to the nucleic acid bases or residues in the two sequences that are the
same when aligned for maximum correspondence over a specified comparison window.
"Percentage of sequence identity" refers to the value determined by comparing two
optimally aligned sequences over a comparison window, wherein the portion of the
polynucleotide sequence in the comparison window may comprise additions or deletions
as compared to the reference window for optimal alignment of the two sequences. The
percentage is calculated by determining the number of positions at which the identical
nucleic acid base or amino acid residue occurs in both sequences to yield the number of
matched positions, dividing the number of matched positions by the total number of
positions in the window of comparison, and multiplying the result by 100 to yield the
percentage of sequence identity.
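
By way of illustration only, the calculation of percentage of sequence identity
described above may be sketched as follows. This is a minimal sketch that assumes the
two sequences have already been optimally aligned by one of the programs named above
and are supplied as equal-length strings in which "-" marks an addition or deletion.
The function name and the example sequences are hypothetical and do not appear in
the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over a comparison window.

    Both arguments are pre-aligned sequences of equal length; "-" marks a gap.
    Matched positions are divided by the total number of positions in the
    window and multiplied by 100, as described in the text.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # A position is matched only if both sequences carry the same base or
    # residue there; a gap never counts as a match.
    matched = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matched / len(aligned_a)


# Hypothetical 8-position window: one mismatch and one gap, six matches.
print(percent_identity("ATGCCGTA", "ATGACG-A"))
```

In this sketch a gapped position counts toward the total number of positions in the
window but never toward the matched positions; alignment programs may treat gaps
differently under their own default parameters.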

Polynucleotide sequences having "substantial identity" are those sequences having
at least about 50%-60% sequence identity, generally at least 70% sequence identity,
preferably at least 80%, more preferably at least 90%, and most preferably at least 95%,
compared to a reference sequence using one of the alignment programs described above.
Preferably sequence identity is determined using the default parameters determined by
the program. Substantial identity of amino acid sequence generally means sequence
identity of at least 50%, more preferably at least 70%, 80%, 90%, and most preferably at
least 95%.

Nucleotide sequences are generally substantially identical if the two molecules
hybridize to each other under stringent conditions. Generally, stringent conditions are
selected to be about 5°C lower than the thermal melting point for the specific sequence
at a defined ionic strength and pH. Nucleic acid molecules that do not hybridize to each
other under stringent conditions may still be substantially identical if the polypeptides
they encode are substantially identical. This may occur, for example, when a copy of a
nucleic acid is created using the maximum codon degeneracy permitted by the genetic
code.

As noted, hybridization of sequences may be carried out under stringent
conditions. By "stringent conditions" is intended conditions under which a probe will
hybridize to its target sequence to a detectably greater degree than to other sequences.
Stringent conditions are sequence-dependent and will be different in different
circumstances. Typically, stringent conditions will be those in which the salt
concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion
concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C
for short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes (e.g.,
greater than 50 nucleotides). Stringent conditions may also be achieved with the addition
of destabilizing agents such as formamide. Exemplary stringent conditions include
hybridization with a buffer solution of 30 to 35% formamide, 1.0 M NaCl, 1% SDS
(sodium dodecyl sulphate) at 37°C, and a wash in 1X to 2X SSC (20X SSC = 3.0 M
NaCl/0.3 M trisodium citrate) at 50 to 55°C. It is recognized that the temperature, salt,
and wash conditions may be altered to increase or decrease stringency conditions. For
the post-hybridization washes, the critical factors are the ionic strength and temperature
of the final wash solution. See, Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284.
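
By way of illustration only, the salt content of the 1X to 2X SSC wash may be worked
out from the 20X SSC composition given above (3.0 M NaCl/0.3 M trisodium citrate).
The following is a minimal sketch; the function name is hypothetical, and the Na ion
figure assumes complete dissociation, with each trisodium citrate contributing three
Na ions.

```python
# Composition of the 20X SSC stock as stated in the text.
STOCK_FOLD = 20.0
STOCK_NACL_M = 3.0
STOCK_CITRATE_M = 0.3


def ssc_composition(fold: float) -> dict:
    """Molar composition of an SSC solution of the given strength (e.g. 1 or 2)."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    factor = fold / STOCK_FOLD  # simple dilution of the 20X stock
    nacl = STOCK_NACL_M * factor
    citrate = STOCK_CITRATE_M * factor
    # Assumption: full dissociation, three Na ions per trisodium citrate.
    sodium = nacl + 3.0 * citrate
    return {"NaCl_M": nacl, "citrate_M": citrate, "Na_ion_M": sodium}


for fold in (1.0, 2.0):
    comp = ssc_composition(fold)
    print(f"{fold:g}X SSC: NaCl {comp['NaCl_M']:.3f} M, "
          f"citrate {comp['citrate_M']:.3f} M, Na ion {comp['Na_ion_M']:.3f} M")
```

Under these assumptions both wash strengths fall within the Na ion range of about 0.01
to 1.0 M recited above.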

As indicated, fragments and variants of the nucleotide sequences of the invention
are encompassed herein. By "fragment" is intended a portion of the nucleotide sequence.
Fragments of the polynucleotide sequence will generally encode polypeptides which
retain the biological/enzymatic activity of the native protein. Those of skill in the art
routinely generate fragments of polynucleotides of interest through use of commercially
available restriction enzymes; synthetic construction of desired polynucleotides based on
known sequences; or use of "erase-a-base" technologies such as Bal 31 exonuclease, by
which the skilled artisan can generate hundreds of fragments of a known polynucleotide
sequence from along the entire length of the molecule by time-controlled, limited
digestion. Fragments that retain at least one biological or enzymatic activity of the native
protein are equivalents of the native protein for that activity.

By "variants" is intended substantially similar sequences. For example, for
nucleotide sequences, conservative variants include those sequences that, because of the
degeneracy of the genetic code, encode the amino acid sequence of an enzyme of the
mevalonate pathway. Variant nucleotide sequences include synthetically derived
sequences, such as those generated, for example, using site-directed mutagenesis.
Generally, nucleotide sequence variants of the invention will have at least 40%, 50%,
60%, 70%, generally 80%, preferably 85%, 90%, up to 95% sequence identity to its
respective native nucleotide sequence. Activity of polypeptides encoded by fragments
or variants of polynucleotides can be confirmed by assays disclosed herein.
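
By way of illustration only, the effect of the degeneracy of the genetic code on a
conservative nucleotide variant may be sketched as follows. The sketch assumes the
standard genetic code and uses two short hypothetical sequences that are not taken
from the sequence listing; the function names are likewise hypothetical. The two
sequences differ at three synonymous third-codon positions and therefore share the
encoded amino acid sequence while having less than 100% nucleotide identity.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an open reading frame codon by codon ("*" marks a stop)."""
    if len(orf) % 3 != 0:
        raise ValueError("open reading frame length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def nucleotide_identity(a: str, b: str) -> float:
    """Percent identity of two equal-length, ungapped nucleotide sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


native = "ATGGCTCTGAAA"   # hypothetical "native" sequence
variant = "ATGGCACTCAAG"  # hypothetical variant, synonymous changes only

print(translate(native), translate(variant))
print(nucleotide_identity(native, variant))
```

A variant of this kind remains a conservative variant in the sense used above however
far its nucleotide identity falls, provided the encoded amino acid sequence is
unchanged.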

"Variant" in the context of proteins is intended to mean a protein derived from the
native protein by deletion or addition of one or more amino acids to the N-terminal and/or
C-terminal end of the native protein; deletion or addition of one or more amino acids at
one or more sites in the native protein; or substitution of one or more amino acids at one
or more sites in the native protein. Such variants may result from, for example, genetic
polymorphism or human manipulation. Conservative amino acid substitutions will
generally result in variants that retain biological function. Such variants are equivalents
of the native protein. Variant proteins that retain a desired biological activity are
encompassed within the subject invention. Variant proteins of the invention may include
those that are altered in various ways including amino acid substitutions, deletions,
truncations, and insertions. Methods for such manipulation are generally known in the
art. See, for example, Kunkel (1985) Proc. Natl. Acad. Sci. USA 82:488-492; Kunkel et
al. (1987) Methods in Enzymol. 154:367-382; and the references cited therein.

An expression cassette may contain at least one polynucleotide of interest to be
cotransformed into the organism. Such an expression cassette is preferably provided with
a plurality of restriction sites for insertion of the sequences of the invention to be under
the transcriptional regulation of the regulatory regions. The expression cassette may
additionally contain selectable marker genes.

The cassette may include 5' and 3' regulatory sequences operably linked to a
polynucleotide of interest. By "operably linked" is intended, for example, a functional
linkage between a promoter and a second sequence, wherein the promoter sequence
initiates and mediates transcription of the DNA sequence corresponding to the second
sequence. Generally, operably linked means that the nucleic acid sequences being linked
are contiguous and, where necessary to join two protein coding regions, contiguous and
in the same reading frame. When a polynucleotide comprises a plurality of coding
regions that are operably linked such that they are under the control of a single promoter,
the polynucleotide may be referred to as an "operon."

The expression cassette will optionally include, in the 5'-3' direction of
transcription, a transcriptional and translational initiation region, a polynucleotide
sequence of interest, and a transcriptional and translational termination region functional
in plants or microalgae. The transcriptional initiation region, the promoter, is optional,
but may be native or analogous, or foreign or heterologous, to the intended host.
Additionally, the promoter may be the natural sequence or alternatively a synthetic
sequence. By "foreign" is intended that the transcriptional initiation region is not found
in the native organism into which the transcriptional initiation region is introduced. As
used herein, a chimeric gene comprises a coding sequence operably linked to a
transcriptional initiation region that is heterologous to the coding sequence.

The termination region may be native with the transcriptional initiation region,
may be native with the operably linked DNA sequence of interest, or may be derived
from another source. Convenient termination regions are available from the Ti-plasmid
of A. tumefaciens, such as the octopine synthase and nopaline synthase termination
regions. See also Guerineau et al. (1991) Mol. Gen. Genet. 262:141-144; Proudfoot
(1991) Cell 64:671-674; Sanfacon et al. (1991) Genes Dev. 5:141-149; Mogen et al.
(1990) Plant Cell 2:1261-1272; Munroe et al. (1990) Gene 91:151-158; Ballas et al.
(1989) Nucleic Acids Res. 17:7891-7903; and Joshi et al. (1987) Nucleic Acids Res.
15:9627-9639.

Where appropriate, the polynucleotides of interest may be optimized for
expression in the transformed organism. That is, the genes can be synthesized using plant
or algae plastid-preferred codons corresponding to the plastids of the plant or algae of
interest. Methods are available in the art for synthesizing such codon optimized
polynucleotides. See, for example, U.S. Patent Nos. 5,380,831 and 5,436,391, and
Murray et al. (1989) Nucleic Acids Res. 17:477-498, herein incorporated by reference.
Of course, the skilled artisan will appreciate that for the transplastomic purposes
described herein, sequence optimization should be conducted with plastid codon usage
frequency in mind, rather than the plant or algae genome codon usage exemplified in
these references.

It is now well known in the art that when synthesizing a polynucleotide of interest
for improved expression in a host cell it is desirable to design the gene such that its
frequency of codon usage approaches the frequency of codon usage of the host cell. It
is also well known that plastome codon usage may vary from that of the host plant or
microalgae genome. For purposes of the subject invention, "frequency of preferred codon
usage" refers to the preference exhibited by a specific host cell plastid in usage of
nucleotide codons to specify a given amino acid. To determine the frequency of usage
of a particular codon in a gene, the number of occurrences of that codon in the gene is
divided by the total number of occurrences of all codons specifying the same amino acid
in the gene. Similarly, the frequency of preferred codon usage exhibited by a plastid can
be calculated by averaging frequency of preferred codon usage in a number of genes
expressed by the plastid. It usually is preferable that this analysis be limited to genes that
are among those more highly expressed by the plastid. Alternatively, the polynucleotide
of interest may be synthesized to have a greater number of the host plastid's most
preferred codon for each amino acid, or to reduce the number of codons that are rarely
used by the host.
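
By way of illustration only, the codon-frequency calculation described above may be sketched as follows. The function names, the two-amino-acid synonym table, and the toy sequences are assumptions introduced for this sketch and form no part of the disclosed constructs; a practical analysis would use the complete genetic code and orfs from highly expressed plastid genes.

```python
from collections import Counter, defaultdict

# Illustrative subset of the genetic code (assumption: only two amino acids shown).
SYNONYMS = {
    "Leu": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "Lys": ["AAA", "AAG"],
}

def codon_frequencies(orf, synonyms=SYNONYMS):
    """Frequency of each codon among all codons for the same amino acid in one gene."""
    usable = len(orf) - len(orf) % 3          # ignore any trailing partial codon
    counts = Counter(orf[i:i + 3] for i in range(0, usable, 3))
    freqs = {}
    for group in synonyms.values():
        total = sum(counts[c] for c in group)
        if total == 0:                        # amino acid absent from this gene
            continue
        for codon in group:
            freqs[codon] = counts[codon] / total
    return freqs

def plastid_preference(orfs, synonyms=SYNONYMS):
    """Average the per-gene frequencies over a set of plastid genes."""
    sums, seen = defaultdict(float), defaultdict(int)
    for orf in orfs:
        for codon, freq in codon_frequencies(orf, synonyms).items():
            sums[codon] += freq
            seen[codon] += 1
    return {codon: sums[codon] / seen[codon] for codon in sums}
```

For example, `codon_frequencies("AAAAAGAAATTA")` counts two AAA and one AAG among three lysine codons, so AAA is assigned a frequency of 2/3 and AAG a frequency of 1/3.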

The expression cassettes may additionally contain 5' leader sequences in the
expression cassette construct. Such leader sequences can act to enhance translation.
Translation leaders are known in the art and include: picornavirus leaders, for example,
EMCV leader (Encephalomyocarditis 5' noncoding region), Elroy-Stein et al. (1989)
PNAS USA 86:6126-6130; potyvirus leaders, for example, TEV leader (Tobacco Etch
Virus), Allison et al. (1986); MDMV leader (Maize Dwarf Mosaic Virus) Virology
154:9-20; and human immunoglobulin heavy-chain binding protein (BiP), Macejak et al.
(1991) Nature 353:90-94; untranslated leader from the coat protein mRNA of alfalfa
mosaic virus (AMV RNA 4), Jobling et al. (1987) Nature 325:622-625; tobacco mosaic
virus leader (TMV), Gallie et al. (1989) in Molecular Biology of RNA, ed. Cech (Liss,
New York), pp. 237-256; and maize chlorotic mottle virus leader (MCMV), Lommel et
al. (1991) Virology 81:382-385. See also, Della-Cioppa et al. (1987) Plant Physiol.
84:965-968. Other methods known to enhance translation can also be utilized, for
example, introns, and the like.

In preparing an expression cassette, the various polynucleotide fragments may be
manipulated, so as to provide for the polynucleotide sequences in the proper orientation
and, as appropriate, in the proper reading frame. Toward this end, adapters or linkers
may be employed to join the polynucleotide fragments or other manipulations may be
involved to provide for convenient restriction sites, removal of superfluous nucleotides,
removal of restriction sites, or the like. For this purpose, in vitro mutagenesis, primer
repair, restriction, annealing, resubstitutions, e.g., transitions and transversions, may be
involved.

In addition, expressed gene products may be localized to specific organelles in the
target cell by ligating DNA or RNA coding for peptide leader sequences to the
polynucleotide of interest. Such leader sequences can be obtained from several genes
of either plant or other sources. These genes encode cytoplasmically-synthesized proteins
directed to, for example, mitochondria (the F1-ATPase beta subunit from yeast or
tobacco, cytochrome c1 from yeast), chloroplasts (cytochrome oxidase subunit Va from
yeast, small subunit of rubisco from pea), endoplasmic reticulum lumen (protein disulfide
isomerase), vacuole (carboxypeptidase Y and proteinase A from yeast,
phytohemagglutinin from French bean), peroxisomes (D-amino acid oxidase, uricase) and
lysosomes (hydrolases).

Following transformation, a plant may be regenerated, e.g., from single cells,
callus tissue, or leaf discs, as is standard in the art. Almost any plant can be entirely
regenerated from cells, tissues, and organs of the plant. Available techniques are
reviewed in Vasil et al. (1984) in Cell Culture and Somatic Cell Genetics of Plants, Vols.
I, II, and III, Laboratory Procedures and Their Applications (Academic Press); and
Weissbach et al. (1989) Methods for Plant Mol. Biol.

The transformed plants may then be grown, and either pollinated with the same
transformed strain or different strains, and the resulting hybrid having expression of the
desired phenotypic characteristic identified. Two or more generations may be grown to
ensure that expression of the desired phenotypic characteristic is stably maintained and
inherited, and then seeds harvested to ensure expression of the desired phenotypic
characteristic has been achieved.

The particular choice of a transformation technology will be determined by its
efficiency to transform certain target species, as well as the experience and preference of
the person practicing the invention with a particular methodology of choice. It will be
apparent to the skilled person that the particular choice of a transformation system to
introduce nucleic acid into plant or microalgae plastids is not essential to or a limitation
of the invention, nor is the choice of technique for plant regeneration.

Also according to the invention, there is provided a plant or microalgae cell
having the constructs of the invention. A further aspect of the present invention provides
a method of making such a plant cell involving introduction of a vector including the
construct into a plant cell. For integration of the construct into the plastid genome (the
"plastome"), such introduction will be followed by recombination between the vector and
the plastome to introduce the operon sequence of nucleotides into the plastome.
RNA encoded by the introduced nucleic acid construct (operon) may then be transcribed
in the cell and descendants thereof, including cells in plants regenerated from transformed
material. A gene stably incorporated into the plastome of a plant or microalgae is passed
from generation to generation to descendants of the plant or microalgae, so such
descendants should show the desired phenotype.

The present invention also provides a plant or microalgae culture comprising a
plant cell as disclosed. Transformed seeds and plant parts are also encompassed. As
used herein, the expressions "cell," "cell line," and "cell culture" are used
interchangeably and all such designations include progeny, meaning descendants, not
limited to the immediate generation of descendants but including all generations of
descendants. Thus, the words "transformants" and "transformed cells" include the
primary subject cell and cultures derived therefrom without regard for the number of
transfers. It is also understood that all progeny may not be precisely identical in DNA
content, due to naturally occurring, deliberate, or inadvertent mutations. Mutant
progeny that have the same function or biological activity as screened for in the originally
transformed cell are included. Where distinct designations are intended, it will be clear
from the context.

In addition to a plant or microalgae, the present invention provides any clone of
such a plant or microalgae, seed, selfed or hybrid or mated descendants, and any part of
any of these, such as cuttings or seed for plants. The invention provides any plant
propagule, that is any part which may be used in reproduction or propagation, sexual or
asexual, including cuttings, seed, and so on. Also encompassed by the invention is a
plant or microalgae which is a sexually or asexually propagated offspring, clone, or
descendant of such a plant or microalgae, or any part or propagule of said plant,
offspring, clone, or descendant. Plant or microalgae extracts and derivatives are also
provided.

The present invention may be used for transformation of any plant species,
including, but not limited to, corn (Zea mays), canola (Brassica napus, Brassica rapa
ssp.), alfalfa (Medicago sativa), rice (Oryza sativa), rye (Secale cereale), sorghum
(Sorghum bicolor, Sorghum vulgare), sunflower (Helianthus annuus), wheat (Triticum
aestivum), soybean (Glycine max), tobacco (Nicotiana tabacum), potato (Solanum
tuberosum), peanuts (Arachis hypogaea), cotton (Gossypium hirsutum), sweet potato
(Ipomoea batatas), cassava (Manihot esculenta), coffee (Coffea spp.), coconut (Cocos
nucifera), pineapple (Ananas comosus), citrus trees (Citrus spp.), cocoa (Theobroma
cacao), tea (Camellia sinensis), banana (Musa spp.), avocado (Persea americana), fig
(Ficus carica), guava (Psidium guajava), mango (Mangifera indica), olive (Olea
europaea), papaya (Carica papaya), cashew (Anacardium occidentale), macadamia
(Macadamia integrifolia), almond (Prunus amygdalus), sugar beets (Beta vulgaris), oats,
barley, vegetables, ornamentals, and conifers.

Preferably, plants of the present invention are crop plants (for example, cereals
and pulses, maize, wheat, potatoes, tapioca, rice, sorghum, millet, cassava, barley, pea,
and other root, tuber, or seed crops). Important seed crops are oil-seed rape, sugar beet,
maize, sunflower, soybean, and sorghum. Horticultural plants to which the present
invention may be applied may include lettuce; endive; and vegetable brassicas including
cabbage, broccoli, and cauliflower; and carnations and geraniums. The present invention
may be applied to tobacco, cucurbits, carrot, strawberry, sunflower, tomato, pepper,
chrysanthemum, petunia, rose, poplar, eucalyptus, and pine.

Grain plants that provide seeds of interest include oil-seed plants and leguminous
plants. Seeds of interest include grain seeds, such as corn, wheat, barley, rice, sorghum,
rye, etc. Oil seed plants include cotton, soybean, safflower, sunflower, Brassica, maize,
alfalfa, palm, coconut, etc. Leguminous plants include beans and peas. Beans include
guar, locust bean, fenugreek, soybean, garden beans, cowpea, mungbean, lima bean, fava
bean, lentils, chickpea, etc.

Microalgae include but are not limited to the Chlorophyta and the Rhodophyta
and may be such organisms as Chlamydomonas, Haematococcus, and Dunaliella.

Other features and advantages of the present invention will become apparent from
the following detailed description. It should be understood, however, that the detailed
description and the specific examples, while indicating preferred embodiments of the
invention, are given by way of illustration only, since various changes and modifications
within the spirit and scope of the invention will become apparent to those skilled in the
art from this detailed description. Unless indicated otherwise, the respective contents of
the documents cited herein are hereby incorporated by reference to the extent they are not
inconsistent with the teachings of this specification.

Percentages and ratios given herein are by weight, and temperatures are in degrees
Celsius unless otherwise indicated. The references cited within this application are herein
incorporated by reference to the extent applicable. Where necessary to better exemplify
the invention, percentages and ratios may be cross-combined.

Example 1: Isolation of Orfs Encoding Enzymes of the Mevalonate Pathway for the
Construction of Vectors pFCO1 and pFCO2

In an exemplified embodiment, vectors containing open reading frames (orfs)
encoding enzymes of the mevalonate pathway are constructed. Polynucleotides derived
from the yeast Saccharomyces cerevisiae, the plant Arabidopsis thaliana, and the
eubacterium Streptomyces sp. CL190 are used for the construction of vectors, including
plastid delivery vehicles, containing orfs for biosynthesis of the mevalonate pathway
enzymes. Construction of the vectors is not limited to the methods described. It is
routine for one skilled in the art to choose alternative restriction sites, PCR primers, etc.
to create analogous plasmids containing the same orfs or other orfs encoding the enzymes
of the mevalonate pathway. Many of the steps in the construction of the plasmids of the
subject invention can utilize the joining of blunt-end DNA fragments by ligation. As
orientation with respect to the promoter upstream (5') of the described orfs can be critical
for biosynthesis of the encoded polypeptides, restriction analysis is used to determine the
orientation in all instances involving blunt-end ligations. A novel directional ligation
methodology, chain reaction cloning (Pachuk et al., Gene 243:19-25, 2000), can also be
used as an alternative to standard ligations in which the resultant orientation of the insert
is not fixed. All PCR products are evaluated by sequence analysis as is well known in
the art.

The construction of a synthetic operon comprising three yeast orfs encoding
phosphomevalonate kinase, mevalonate kinase, and mevalonate diphosphate
decarboxylase is described by Hahn et al. (Hahn et al., J. Bacteriol. 183:1-11, 2001).
This same synthetic operon, contained within plasmid pFCO2, is able to synthesize, in
vivo, polypeptides with enzymatic activities able to convert exogenously supplied
mevalonate to IPP as demonstrated by the ability of the mevalonate pathway orfs to
complement the temperature sensitive dxs::kanr lethal mutation in E. coli strain FH11
(Hahn et al., 2001).

Plasmids pFCO1 and pFCO2 containing a synthetic operon for the biosynthesis
of IPP from mevalonate are constructed as follows: Three yeast orfs encoding mevalonate
kinase, phosphomevalonate kinase, and mevalonate diphosphate decarboxylase are
isolated from S. cerevisiae genomic DNA by PCR using the respective primer sets

FH0129-2:

5' GGACTAGTCTGCAGGAGGAGTTTTAATGTCATTAGCGTTCTTAAC
TTCTGCACCGGG 3' (sense) (SEQ ID NO: 1) and

FH0129-1:

5' TTCTCGAGCTTAAGAGTAGCAATATTTACCGGAGCAGTTACACTA
GCAGTATATACAGTCATTAAAACTCCTCCTGTGAAGTCCATGGTAAATTCG 3'
(antisense) (SEQ ID NO: 2);

FH0211-1:

5' TAGCGGCCGCAGGAGGAGTTCATATGTCAGAGTTGAGAGCCTTC
AGTGCCCCAGGG 3' (sense) (SEQ ID NO: 3) and

FH0211-2:

5' TTTCTGCAGTTTATCAAGATAAGTTTCCGGATCTTT 3' (antisense) (SEQ ID
NO: 4);

CT0419-1:

5' GGAATTCATGACCGTTTACACAGCATCCGTTACCGCACCCG 3' (sense) (SEQ
ID NO: 5) and

CT0419-2:

5' GGCTCGAGTTAAAACTCCTCTTCCTTTGGTAGACCAGTCTTTGCG 3'
(antisense) (SEQ ID NO: 6).

Primer FH0129-2 includes a SpeI site (underlined). Primer FH0129-1 contains an XhoI
site (underlined), an AflII site (double-underlined), and 54 nucleotides (bold italics)
corresponding to the 5' end of the yeast orf for mevalonate diphosphate decarboxylase.
Following PCR using primers FH0129-1 and FH0129-2, a product containing the orf
encoding yeast mevalonate kinase is isolated by agarose gel electrophoresis and
GeneClean purified. Following restriction with SpeI-XhoI, the PCR product is inserted
into the SpeI-XhoI sites of pBluescript(SK+) (Stratagene, La Jolla, CA) by ligation to
create pERG12. Primers FH0211-1 and FH0211-2 contain a NotI site (underlined) and
a PstI site (underlined), respectively. Following PCR using primers FH0211-1 and
FH0211-2, a product containing the orf encoding yeast phosphomevalonate kinase is
restricted with NotI-PstI, purified by GeneClean, and inserted into pGEM-T Easy
(Promega Corp., Madison, WI) by ligation to create pERG8. An orf encoding yeast
mevalonate diphosphate decarboxylase is isolated by PCR using primers CT0419-1 and
CT0419-2 and inserted directly into pGEM-T Easy by ligation to create pERG19.
Restriction of pERG8 with NotI-PstI yields a 1.4 Kb DNA fragment containing the orf
for phosphomevalonate kinase. Restriction of pERG12 with NotI-PstI is followed by the
insertion of the 1.4 Kb NotI-PstI DNA fragment by ligation to create pERG812
containing the orfs for both phosphomevalonate kinase and mevalonate kinase and the
5' end of the orf for yeast mevalonate diphosphate decarboxylase. Restriction of pERG19
with AflII-XhoI yields a 1.2 Kb DNA fragment containing the 3' end of the orf for yeast
mevalonate diphosphate decarboxylase missing in pERG812. Insertion of the 1.2 Kb
AflII-XhoI DNA fragment into pERG812/AflII-XhoI by ligation yields pFCO1
containing the three yeast mevalonate pathway orfs (Fig. 1). Restriction of pFCO1 with
XhoI is followed by treatment with the Klenow fragment of T7 DNA polymerase and
dNTPs to create blunt ends. Subsequent restriction of pFCO1/XhoI/Klenow with SacI
yields a 3.9 Kb DNA fragment containing the three yeast mevalonate pathway orfs.
Following agarose gel electrophoresis and GeneClean purification of the 3.9 Kb DNA
fragment, it is inserted into the SmaI-SacI sites of pNGH1-amp (Garrett et al., J. Biol.
Chem. 273:12457-12465, 1998) by ligation to create pFCO2 (Fig. 2).
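
Because the underlining and bold type that mark the restriction sites in the primers do not survive in plain text, the sites can be located by a simple sequence search. The following is a minimal sketch offered for illustration only; the function name and the small site table are assumptions of the sketch, and the recognition sequences shown are the standard ones for the named enzymes.

```python
# Recognition sequences (all palindromic, so searching one strand is sufficient).
SITES = {
    "SpeI": "ACTAGT",
    "XhoI": "CTCGAG",
    "AflII": "CTTAAG",
    "NotI": "GCGGCCGC",
    "PstI": "CTGCAG",
    "EcoRI": "GAATTC",
}

def find_sites(primer, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each site found in the primer."""
    seq = primer.upper().replace(" ", "")   # tolerate stray spaces from typesetting
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start + 1)
            start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits

# Primer sequences as listed above (SEQ ID NOs: 4, 5, and 6).
FH0211_2 = "TTTCTGCAGTTTATCAAGATAAGTTTCCGGATCTTT"
CT0419_1 = "GGAATTCATGACCGTTTACACAGCATCCGTTACCGCACCCG"
CT0419_2 = "GGCTCGAGTTAAAACTCCTCTTCCTTTGGTAGACCAGTCTTTGCG"
```

Applied to primer FH0211-2, the search reports a single PstI site beginning at nucleotide 4, in agreement with the description of that primer.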

Example 2: Construction of E. coli strain FH11 (JM101/dxs::kanr/pDX4)

A mutant E. coli strain containing a disruption of the chromosomal dxs gene is
constructed as described by Hamilton et al. (Hamilton et al., J. Bacteriol. 171:4617-4622,
1989). The strains are grown at 30° C or 44° C in Luria-Bertani (LB) supplemented with
the following antibiotics as necessary: ampicillin (Amp) (50 µg/ml), chloramphenicol
(Cam) (30 µg/ml), and kanamycin (Kan) (25 µg/ml). Within phagemid DD92 (F. R.
Blattner, University of Wisconsin, Madison, WI) is a 19.8 Kb EcoRI fragment of E. coli
genomic DNA containing dxs, the gene for DXP synthase. Following the isolation of the
phage from E. coli strain LE392, DD92 is restricted with SphI, and the resultant 6.3 Kb
fragment is isolated by agarose gel electrophoresis. GeneClean purification of the SphI
fragment and restriction with SmaI yields a 2.0 Kb SphI-SmaI fragment containing E.
coli dxs. The 2.0 Kb fragment is purified by GeneClean and inserted by ligation into the
SphI-HindII sites of pMAK705, a plasmid containing a temperature-sensitive origin of
replication (Hamilton et al., J. Bacteriol. 171:4617-4622, 1989). The resulting plasmid
containing wt dxs, pDX4, is restricted with SapI, a unique site located in the middle of
the dxs gene, and the 5'-overhangs are filled in with Klenow and dNTPs. The
blunt-ended DNA fragment is purified by GeneClean and treated with shrimp alkaline
phosphatase (SAP, USB Corp., Cleveland, OH) according to the manufacturer's
instructions. pUC4K (Amersham Pharmacia Biotech, Piscataway, NJ) is restricted with
EcoRI, Klenow-treated, and the resulting 1.3 Kb blunt-ended DNA fragment containing
the gene for Kan resistance is inserted into the filled-in SapI site of pDX4 by blunt-end
ligation to create pDX5 with a disruption in E. coli dxs. Competent E. coli JM101 cells
are transformed with pDX5, a pMAK705 derivative containing dxs::kanr, and grown to
an optical density (A600) of 0.6 at 30° C. Approximately 10,000 cells are plated out on
LB/Cam medium prewarmed to 44° C. The plates are incubated at 44° C, and several
of the resulting colonies are grown at 44° C in 4 ml of LB/Cam medium. Four 50 ml
LB/Cam cultures are started with 0.5 ml from four of the 4 ml cultures and grown
overnight at 30° C. Four fresh 50 ml LB/Cam cultures are started with 100 µl of the
previous cultures and grown overnight at 30° C. An aliquot of one of the 50 ml cultures
is serially diluted 5 x 10^5 fold, and 5 µl is plated on LB/Cam medium. Following
incubation at 30° C, the resulting colonies are used to individually inoculate 3 ml of LB
medium containing Cam and Kan. Twelve LB/Cam/Kan cultures are grown overnight
at 30° C and used for plasmid DNA isolation. E. coli cells where the disrupted copy of
dxs is incorporated into the genome are identified by restriction analysis of the isolated
plasmid DNA and verified by sequence analysis of the DNA contained in the plasmids.
The E. coli JM101 derivative containing the dxs::kanr mutation is designated FH11
(Hahn et al., 2001).

Example 3: Assay Demonstrating Synthesis of IPP from Mevalonic Acid in E. coli

The episomal copy of dxs contained on pDX4 in E. coli strain FH11 is "turned
off" at 44° C due to a temperature sensitive origin of replication on the pMAK705
derivative (Hamilton et al., J. Bacteriol. 171:4617-4622, 1989). The inability of FH11
to grow at the restrictive temperature demonstrates that dxs is an essential single copy
gene in E. coli (Hahn et al., 2001). A cassette containing three yeast mevalonate pathway
orfs is removed from pFCO1 and inserted into pNGH1-amp to form pFCO2 for testing
the ability of the mevalonate pathway orfs to complement the dxs::kanr disruption when
FH11 is grown at 44° C on medium containing mevalonate. The utility of strain FH11
as a component of an assay for testing the ability of mevalonate pathway orfs to direct
the synthesis of IPP is demonstrated as follows:

Colonies of E. coli strain FH11 transformed with pFCO2 or pNGH1-amp, the
expression vector without an insert, are isolated by incubation at 30° C on LB plates
containing Kan and Amp. Four ml LB/Kan/Amp cultures containing either FH11/pFCO2
or FH11/pNGH1-amp are grown overnight at 30° C. Following a 10,000-fold dilution,
10 µl portions from the cultures are spread on LB/Kan/Amp plates that are prewarmed
to 44° C or are at rt. Approximately 1.3 mg of mevalonic acid is spread on each plate
used for FH11/pFCO2. The prewarmed plates are incubated at 44° C, and the rt plates
are incubated at 30° C overnight.

FH11/pNGH1-amp cells will not grow at the restrictive temperature of 44° C, and
FH11/pFCO2 cells are unable to grow at 44° C unless mevalonic acid (50 mg/L) is
added to the growth medium, thus establishing the ability of the polypeptides encoded by
the mevalonate pathway orfs contained in the synthetic operon within pFCO2 to form IPP
from mevalonate in vivo (Hahn et al., 2001).
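
The outcome of the complementation assay described above can be summarized as a simple decision rule. The sketch below is illustrative only; the function and parameter names are assumptions, and the rule covers just the two incubation temperatures used in the assay (30° C permissive, 44° C restrictive).

```python
PERMISSIVE_C = 30    # episomal dxs on the temperature-sensitive plasmid is maintained
RESTRICTIVE_C = 44   # temperature-sensitive origin fails; episomal dxs is lost

def expected_growth(temp_c, has_mevalonate_operon, mevalonate_supplied):
    """Expected growth of the dxs-disrupted strain under the assay conditions."""
    if temp_c == PERMISSIVE_C:
        return True
    if temp_c == RESTRICTIVE_C:
        # IPP can only come from supplied mevalonate via the three-orf operon.
        return has_mevalonate_operon and mevalonate_supplied
    raise ValueError("only the two assay temperatures are modeled")
```

Under this rule, only cells carrying the synthetic operon and plated with mevalonic acid are expected to form colonies at 44° C, which is the observation relied upon in this example.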

Example 4: Isolation of Mevalonate Pathway Orfs 

In a specific, exemplified embodiment, the isolation of orfs, each encoding a
polypeptide with either HMG-CoA synthase enzyme activity, HMG-CoA reductase
enzyme activity, or acetoacetyl-CoA thiolase enzyme activity, and construction of vectors
containing these orfs is as follows: Synthesis of A. thaliana first strand cDNAs is
performed utilizing PowerScript™ reverse transcriptase (Clontech Laboratories, Inc.,
Palo Alto, CA) according to the manufacturer's instructions. Specifically, a microfuge
tube containing 5 µl of A. thaliana RNA (Arabidopsis Biological Resource Center, Ohio
State University, Columbus, OH), 1.8 µl poly(dT)15 primer (0.28 µg/µl, Integrated DNA
Technologies, Inc., Coralville, IA), and 6.2 µl DEPC-treated H2O is heated at 70° C for
10 min and then immediately cooled on ice. The mixture is spun down by centrifugation
and 4 µl of 5X First-Strand Buffer (Clontech), 2 µl Advantage UltraPure PCR dNTP
mix (10 mM each, Clontech) and 2 µl 100 mM DTT are added and the entire contents
mixed by pipetting. Following the addition of 1 µl reverse transcriptase (Clontech) and
mixing by pipetting, the contents are incubated at 42° C for 90 min and then heated at
70° C for 15 min to terminate the reaction.

The resulting A. thaliana first strand cDNAs are used as templates for the
synthesis of an orf encoding HMG-CoA synthase and a truncated HMG-CoA reductase
by PCR in a Perkin-Elmer GeneAmp PCR System 2400 thermal cycler utilizing the
Advantage®-HF 2 PCR Kit (Clontech) according to the manufacturer's instructions. An
A. thaliana HMG-CoA synthase orf is isolated using the following PCR primers:

1) 5' GCTCTAGATGCGCAGGAGGCACATATGGCGAAGAACGTTGGGATTTTG
GCTATGGATATCTATTTCCC 3' (sense) (SEQ ID NO: 7); and

2) 5' CGCTCGAGTCGACGGATCCTCAGTGTCCATTGGCTACAGATCCATCTTC
ACCTTTCTTGCC 3' (antisense) (SEQ ID NO: 8);

containing the restriction site XbaI shown underlined, the restriction site XhoI shown in
bold italic and the restriction site SalI shown double underlined. Specifically, 2 µl cDNA,
5 µl 10X HF 2 PCR Buffer (Clontech), 5 µl 10X HF 2 dNTP Mix (Clontech), 1 µl each
of the primers described above, 1 µl 50X Advantage-HF 2 Polymerase Mix (Clontech),
and 35 µl PCR-Grade H2O (Clontech) are combined in a 0.5 ml PCR tube. The mixture
is heated at 94° C for 15 sec then subjected to 40 PCR cycles consisting of 15 sec at 94°
C and 4 min at 68° C. After a final incubation at 68° C for 3 min, the reaction is cooled
to 4° C. Agarose gel electrophoresis is performed on a 10 µl aliquot to confirm the
presence of a DNA fragment of the predicted size of 1.4 Kb. The PCR is repeated in
triplicate to generate enough product for its isolation by gel excision and purification by
GeneClean (Qbiogene, Inc., Carlsbad, CA). Following restriction with XbaI-XhoI and
purification by GeneClean, the 1.4 Kb PCR product is inserted into the XbaI-XhoI sites
of pBluescript(SK+) by ligation to form putative pBSHMGS constructs. Sequence
analysis of several of the candidate constructs is performed to identify inserts with DNA
identical to the published A. thaliana orf for HMG-CoA synthase; these are used for the
construction of pBSHMGSR as described below.
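
As a check on the reaction set-up described above, the component volumes can be summed and the working concentrations of the concentrated stocks confirmed. The following sketch is illustrative only; the dictionary keys and function names are assumptions, while the volumes are those recited for the HMG-CoA synthase PCR.

```python
# Component volumes in microliters, as recited for the HMG-CoA synthase PCR.
PCR_MIX_UL = {
    "cDNA": 2,
    "10X HF 2 PCR Buffer": 5,
    "10X HF 2 dNTP Mix": 5,
    "sense primer": 1,
    "antisense primer": 1,
    "50X Advantage-HF 2 Polymerase Mix": 1,
    "PCR-Grade H2O": 35,
}

def total_volume(mix):
    """Total reaction volume in microliters."""
    return sum(mix.values())

def final_fold(stock_fold, volume_ul, total_ul):
    """Working concentration (in X) of a stock after dilution into the reaction."""
    return stock_fold * volume_ul / total_ul
```

The recited volumes sum to 50 µl, at which the 10X buffer (5 µl) and the 50X polymerase mix (1 µl) are each present at 1X.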

An A. thaliana orf encoding a polypeptide with HMG-CoA reductase enzyme
activity is synthesized by PCR essentially as described above using the following
primers:

3) 5' CCGCTCGAGCACGTGGAGGCACATATGCAATGCTGTGAGATGCCT
GTTGGATACATTCAGATTCCTGTTGGG 3' (sense) (SEQ ID NO: 9); and

4) 5' GGGGTACCTGCGGCCGGATCCCGGGTCATGTTGTTGTTGTTGTCGT
TGTCGTTGCTCCAGAGATGTCTCGG 3' (antisense) (SEQ ID NO: 10);

containing the restriction site XhoI shown underlined, the restriction site KpnI shown in
italic, the restriction site EagI shown in bold, and the restriction site SmaI shown double
underlined. The 1.1 Kb PCR product is isolated by agarose gel electrophoresis, purified
by GeneClean and inserted into the pT7Blue-3 vector (Novagen, Inc., Madison, WI)
using the Perfectly Blunt™ Cloning Kit (Novagen) according to the manufacturer's
instructions. Sequence analysis is performed to identify constructs containing A. thaliana
DNA encoding the desired C-terminal portion of the published HMG-CoA reductase
amino acid sequence; these are designated pHMGR.

PCR is performed on S. cerevisiae genomic DNA (Invitrogen Corp., Carlsbad,
CA) by using the Advantage®-HF 2 PCR Kit (Clontech) according to the manufacturer's
instructions and the following primers:

5) 5' ACAACACCGCGGCGGCCGCGTCGACTACGTAGGAGGCACATATGTC
TCAGAACGTTTACATTGTATCGACTGCC 3' (sense) (SEQ ID NO: 11); and

6) 5' GCTCTAGAGGATCCTCATATCTTTTCAATGACAATAGAGGAAGCACC
ACCACC 3' (antisense) (SEQ ID NO: 12);

containing the restriction site NotI shown underlined, the restriction site SacII shown in
italic, the restriction site SalI shown in bold, the restriction site SnaBI shown double
underlined, and the restriction site XbaI in bold italic. The 1.2 Kb PCR product is isolated
by agarose gel electrophoresis, purified by GeneClean and inserted into the vector
pT7Blue-3 (Novagen) using the Perfectly Blunt™ Cloning Kit (Novagen) according to
the manufacturer's instructions. Sequence analysis is performed to identify constructs
containing S. cerevisiae DNA identical to the published orf encoding acetoacetyl-CoA
thiolase and they are designated pAACT.

Example 5: Construction of pHKO1

In an exemplified embodiment, a pBluescript(SK+) derivative containing an
operon with orfs encoding polypeptides with enzymatic activities for HMG-CoA
synthase, HMG-CoA reductase, and acetoacetyl-CoA thiolase is constructed as follows:
Following restriction of pHMGR with XhoI-KpnI, isolation of the 1.1 Kb DNA fragment
by agarose gel electrophoresis, and purification by GeneClean, the 1.1 Kb XhoI-KpnI
DNA fragment containing the orf encoding the C-terminal portion of A. thaliana
HMG-CoA reductase is inserted into the SalI-KpnI sites of pBSHMGS by ligation to
create pBSHMGSR. Following restriction of pAACT with SacII-XbaI, isolation of the
1.2 Kb DNA fragment containing the orf encoding yeast acetoacetyl-CoA thiolase by
agarose gel electrophoresis, and purification by GeneClean, the 1.2 Kb SacII-XbaI DNA
fragment is inserted into the SacII-XbaI sites of pBSHMGSR by ligation to create
pHKO1 (Fig. 3).

Example 6: Construction of pHK02 

In a specific, exemplified embodiment, a vector containing a synthetic operon
consisting of six orfs encoding polypeptides with acetoacetyl-CoA thiolase, HMG-CoA
synthase, HMG-CoA reductase, mevalonate kinase, phosphomevalonate kinase, and
mevalonate diphosphate decarboxylase enzymatic activities, thus comprising the entire
mevalonate pathway, is constructed as follows: Restriction of pHK01 with EagI yields
a 3.7 Kb DNA fragment containing orfs encoding yeast acetoacetyl-CoA thiolase, A.
thaliana HMG-CoA synthase, and a truncated A. thaliana HMG-CoA reductase.
Following isolation of the 3.7 Kb EagI DNA fragment by agarose gel electrophoresis and
purification by GeneClean, it is directionally inserted into the NotI site of pFC02 (Hahn
et al., 2001) utilizing the methodology of chain reaction cloning (Pachuk et al., 2000),
thermostable Ampligase® (Epicentre Technologies, Madison, WI), and the following
bridge oligonucleotide primers:

1) 5' TGGAATTCGAGCTCCACCGCGGTGGCGGCCGCGTCGACGCCGGCGGAG
GCACATATGTCT 3' (SEQ ID NO: 13); and

2) 5' AACAACAACAACATGACCCGGGATCCGGCCGCAGGAGGAGTTCATATG
TCAGAGTTGAGA 3' (SEQ ID NO: 14);

as follows: Agarose gel electrophoresis is performed on the 8.1 Kb pFC02/NotI DNA
fragment and the 3.7 Kb EagI DNA fragment isolated from pHK01 to visually estimate
their relative concentrations. Approximately equivalent amounts of each fragment
totaling 4.5 µl, 1 µl of each bridge oligo at a concentration of 200 nM, 5 µl Ampligase®
10X Reaction Buffer (Epicentre), 3 µl Ampligase® (5 U/µl) (Epicentre), and 35.5 µl PCR
grade H2O are added to a 0.5 ml PCR tube. The mixture is heated at 94° C for 2 min,
then subjected to 50 PCR cycles consisting of 30 sec at 94° C, 30 sec at 60° C, and 1 min
at 66° C. After a final incubation at 66° C for 5 min, the reaction is cooled to 4° C.
Colonies resulting from the transformation of E. coli strain NovaBlue (Novagen) with 1
µl of the directional ligation reaction are grown in LB medium supplemented with
ampicillin at a final concentration of 50 µg/ml. Restriction analysis with NaeI-KpnI of
mini-prep plasmid DNA from the liquid cultures is performed to identify candidate
pHK02 constructs by the presence of both a 5.7 and a 6.2 Kb DNA fragment. Further
analysis by restriction with SmaI-XhoI to generate both a 3.9 and 7.9 Kb DNA fragment
confirms the successful construction of pHK02 (Fig. 4).
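
The reaction mixture described in this Example can be checked arithmetically. The following is a minimal sketch, not part of the disclosed method, that sums the stated volumes and derives the final buffer strength, the final bridge oligo concentration, and the programmed cycling time. The variable names are illustrative, and the time estimate ignores instrument ramp times.

```python
# Reaction components from Example 6, volumes in microlitres (ul).
components_ul = {
    "DNA fragments (vector + insert)": 4.5,
    "bridge oligo 1": 1.0,
    "bridge oligo 2": 1.0,
    "Ampligase 10X Reaction Buffer": 5.0,
    "Ampligase": 3.0,
    "PCR grade H2O": 35.5,
}
OLIGO_STOCK_NM = 200.0   # stated bridge oligo concentration
LIGASE_U_PER_UL = 5.0    # stated Ampligase concentration

total_ul = sum(components_ul.values())
buffer_strength_x = 10.0 * components_ul["Ampligase 10X Reaction Buffer"] / total_ul
oligo_final_nm = OLIGO_STOCK_NM * components_ul["bridge oligo 1"] / total_ul
ligase_units = LIGASE_U_PER_UL * components_ul["Ampligase"]

# Thermal program: 2 min hold, 50 cycles of 30 s + 30 s + 60 s, 5 min final hold.
initial_s, cycles, cycle_s, final_s = 120, 50, 30 + 30 + 60, 300
programmed_min = (initial_s + cycles * cycle_s + final_s) / 60

print(f"total volume: {total_ul} ul, buffer: {buffer_strength_x}X")
print(f"each bridge oligo: {oligo_final_nm} nM, ligase: {ligase_units} U")
print(f"programmed time (no ramping): {programmed_min} min")
```

The stated volumes add up to a 50 µl reaction, in which the 10X buffer is diluted to 1X and each bridge oligo is present at 4 nM.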

Example 7: Assay Demonstrating the Synthesis of IPP from Acetyl-CoA in E. coli

In a specific, exemplified embodiment, a derivative of pNGH1-amp (Hahn et al.,
2001), containing the entire mevalonate pathway, is assayed (Fig. 5) for its ability to
synthesize IPP from endogenous acetyl-CoA in E. coli strain FH11, containing the
temperature sensitive dxs::kanr knockout (Hahn et al., 2001), as follows: Colonies
resulting from the transformation of FH11 by pHK02, containing orfs encoding
polypeptides with enzymatic activities for acetoacetyl-CoA thiolase, HMG-CoA
synthase, HMG-CoA reductase, mevalonate kinase, phosphomevalonate kinase, and
mevalonate diphosphate decarboxylase, are isolated by incubation at 30° C on LB plates
containing Kan and Amp. Several 4 ml LB/Kan/amp samples are individually inoculated
with single colonies from the FH11/pHK02 transformation. Following growth at 30°
C overnight, the FH11/pHK02 cultures are diluted 100,000-fold, and 5 µl aliquots are
spread on LB/Kan/amp plates that are at room temperature (rt) or that are prewarmed
to 44° C. The prewarmed plates are incubated at 44° C, and the rt plates are incubated
at 30° C overnight. FH11 and FH11/pNGH1-amp cells will not grow at the restrictive
temperature of 44° C (Hahn et al., 2001). FH11/pHK02 cells are able to grow at 44° C,
thus establishing the ability of a synthetic operon comprising the entire mevalonate
pathway to form IPP from acetyl-CoA and thereby overcome the dxs::kanr block to MEP
pathway biosynthesis of IPP in E. coli strain FH11.
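
The plating step above (100,000-fold dilution, 5 µl spread per plate) determines roughly how many colonies to expect on the permissive plate. The sketch below is illustrative only. The overnight culture density of 1e9 CFU/ml is an assumption that the source does not state, and the function name is hypothetical.

```python
def expected_colonies(culture_cfu_per_ml: float,
                      dilution_fold: float,
                      plated_ul: float) -> float:
    """Expected colony count for a diluted culture spread on one plate."""
    diluted_cfu_per_ml = culture_cfu_per_ml / dilution_fold
    return diluted_cfu_per_ml * (plated_ul / 1000.0)  # convert ul to ml

# ASSUMPTION: a saturated overnight E. coli culture at about 1e9 CFU/ml.
ASSUMED_OVERNIGHT_CFU_PER_ML = 1e9

colonies = expected_colonies(ASSUMED_OVERNIGHT_CFU_PER_ML,
                             dilution_fold=100_000,
                             plated_ul=5.0)
print(f"expected colonies per plate: about {colonies:.0f}")
```

Under that assumed density, each plate receives on the order of tens of cells, which gives well separated colonies for comparing growth at 30° C and 44° C.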

Example 8: Construction of pHK03 

In another exemplified embodiment, a derivative of pBluescript(SK+) containing
an operon comprising orfs, which in their summation comprise the entire mevalonate
pathway, is constructed as follows: pHK01, containing orfs encoding acetoacetyl-CoA
thiolase, HMG-CoA synthase, and an N-terminal truncated HMG-CoA reductase, is
restricted with SalI-NotI and purified by GeneClean. The pBluescript(SK+) derivative
pFC01, containing the orfs encoding mevalonate kinase, phosphomevalonate kinase, and
mevalonate diphosphate decarboxylase, has been described above in Example 1.
Following restriction of pFC01 with XhoI-NotI, isolation by agarose gel electrophoresis,
and purification by GeneClean, the 3.9 Kb DNA fragment containing the mevalonate
pathway orfs is inserted into pHK01/SalI-NotI by directional ligation (Pachuk et al.,
2000) utilizing thermostable Ampligase® (Epicentre Technologies, Madison, WI), and
the following bridging oligonucleotides:

1) 5' CTCAACTCTGACATATGAACTCCTCCTGCGGCCGCCGCGGTGGAGCTCC
AGCTTTTGTTCCC 3' (SEQ ID NO: 15); and

2) 5' GGTCTACCAAAGGAAGAGGAGTTTTAACTCGACGCGGGCGGAGGCACA
TATGTCTCAGAACG 3' (SEQ ID NO: 16);

essentially as described for the construction of pHK02. Restriction analysis is performed
with KpnI to confirm the successful construction of pHK03 (Fig. 6).
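
In chain reaction cloning, each bridging oligonucleotide spans the junction between two fragments so that the ends to be joined are held adjacent for the thermostable ligase. The following simplified sketch shows one way such an oligo could be derived from the sequences on either side of a junction. It is an illustration under stated assumptions, not the procedure used to design SEQ ID NOs: 13-16: the function names are hypothetical, the toy sequences are invented, and the choice of strand depends on which strand is to be ligated.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def bridge_oligo(upstream_top: str, downstream_top: str, arm: int = 30) -> str:
    """Oligo complementary to the last `arm` bases of the upstream fragment
    and the first `arm` bases of the downstream fragment (top strand)."""
    junction = upstream_top[-arm:] + downstream_top[:arm]
    return reverse_complement(junction)

# Toy fragment ends (hypothetical sequences, short arms for readability).
upstream_end = "ATGCATGA"
downstream_start = "TTGACCGT"
oligo = bridge_oligo(upstream_end, downstream_start, arm=4)
print(oligo)
```

With the default arm of 30 bases the result is a 60-mer, which matches the length of most bridging oligonucleotides listed in these Examples.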

Example 9: Construction of Tobacco Plastid Transformation Vector pHK04 

In an exemplified embodiment, a vector containing a Nicotiana tabacum plastid
pseudogene is utilized to create a plastid transformation vector as follows: The
pBluescript(SK+) derivative designated as pBSNT27 (Fig. 7, SEQ ID NO: 17) contains
a 3.3 Kb BglII-BamHI DNA fragment of the N. tabacum chloroplast genome
corresponding approximately to base-pairs 80553-83810 of the published nucleotide
sequence (Sugiura, M., 1986, and Tsudsuki, T., 1998). A unique restriction site
contained within the tobacco infA pseudogene located on pBSNT27 is cleaved with BglII
and the resulting 5' overhangs are filled in with Klenow and dNTPs. The resulting 6.2
Kb blunt-ended DNA fragment is GeneClean purified. Following restriction of pHK03
with EagI, filling in of the resulting 5' overhangs with Klenow and dNTPs, isolation by
agarose gel electrophoresis, and purification by GeneClean, the resulting 7.7 Kb
blunt-ended DNA fragment, containing orfs encoding the entire mevalonate pathway, is
directionally inserted into the blunt-ended BglII site of pBSNT27 utilizing chain reaction
cloning (Pachuk et al., 2000), thermostable Ampligase® (Epicentre Technologies,
Madison, WI), and the following bridging oligonucleotides:

1) 5' GATCTTTCCTGAAACATAATTTATAATCAGATCGGCCGCAGGAGGAG
TTCATATGTCAGAGTTGAG 3' (SEQ ID NO: 18); and

2) 5' GACAACAACAACAACATGACCCGGGATCCGGCCGATCTAAACAAACCCG
GAACAGACCGTTGGGAA 3' (SEQ ID NO: 19);

to form the tobacco plastid-specific transformation vector pHK04 (Fig. 8).

Alternatively, other derivatives of pBSNT27 can be constructed, using skills as
known in the art, that are not reliant upon an available restriction site(s) in the
pseudogene. For example, although the infA pseudogene comprises basepairs 3861-4150
in pBSNT27, there are unique restriction sites in close proximity, upstream and
downstream, that can be utilized to excise the entire pseudogene followed by its
replacement with an orf or gene cluster comprising multiple orfs, e.g. the complete
mevalonate pathway described above. Specifically, there is a unique BsrGI site at 3708
base pairs and a unique SexAI restriction site at 4433 base pairs within pBSNT27. Thus,
as will be readily apparent to those skilled in the art, one can replace the infA pseudogene
entirely by inserting a BsrGI-SexAI DNA fragment containing DNA, comprising orfs
encoding the entire mevalonate pathway, that is flanked by the excised DNA originally
flanking the infA pseudogene, i.e. DNA corresponding to 3708-3860 and 4151-4433 base
pairs in pBSNT27. The resultant construct will be missing the pseudogene, but will
contain the excised flanking DNA restored to its original position and now surrounding
the mevalonate pathway orfs. Also, a similar strategy, that will also be apparent to those
skilled in the art in view of this disclosure, can be employed that restores the intact
pseudogene to a location between the DNA originally flanking it, yet linked to an orf or
orfs located upstream and/or downstream of the pseudogene and adjacent to the original
flanking DNA.
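
The coordinates given above for pBSNT27 can be used to work out the sizes of the two flanks and of the excised pseudogene. The sketch below is a bookkeeping aid only. It assumes that the quoted positions are 1-based and inclusive, which the source does not specify, and the helper names and the toy cargo sequence are hypothetical.

```python
BSRGI_SITE = 3708          # unique BsrGI site in pBSNT27
SEXAI_SITE = 4433          # unique SexAI site in pBSNT27
INFA = (3861, 4150)        # infA pseudogene

def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate span (an assumption)."""
    return end - start + 1

left_flank = (BSRGI_SITE, INFA[0] - 1)    # 3708-3860
right_flank = (INFA[1] + 1, SEXAI_SITE)   # 4151-4433

left_len = span_length(*left_flank)
infa_len = span_length(*INFA)
right_len = span_length(*right_flank)

def replacement_fragment(left_seq: str, cargo_seq: str, right_seq: str) -> str:
    """Cargo (e.g. pathway orfs) placed between the original flanking DNA."""
    return left_seq + cargo_seq + right_seq

print(left_flank, left_len, "bp")
print(INFA, infa_len, "bp")
print(right_flank, right_len, "bp")
print(len(replacement_fragment("A" * left_len, "C" * 7700, "G" * right_len)), "bp")
```

The last line uses a 7.7 Kb placeholder for the cargo, corresponding to the size stated for the EagI fragment of pHK03. The actual size of a replacement fragment would depend on the cargo chosen.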

Example 10: Construction of Vectors Containing Orfs Encoding IPP Isomerase (pHK05
and pHK06)

In a specific, exemplified embodiment, orfs encoding IPP isomerase are isolated
and vectors containing an operon comprising orfs for the entire mevalonate pathway and
an additional orf for IPP isomerase are constructed as follows: A Rhodobacter
capsulatus orf encoding a polypeptide with IPP isomerase activity is isolated by PCR
from genomic DNA (J. E. Hearst, Lawrence Berkeley Laboratories, Berkeley, CA) using
the following primers:

1) 5' CGCTCGAGTACGTAAGGAGGCACATATGAGTGAGCTTATACCCGCCTG
GGTTGG 3' (sense) (SEQ ID NO: 20); and

2) 5' GCTCTAGAGATATCGGATCCGCGGCCGCTCAGCCGCGCAGGATCGATCC
GAAAATCC 3' (antisense) (SEQ ID NO: 21);
containing the restriction sites XhoI shown underlined, BsaAI shown in bold, XbaI shown
in italic, EcoRV shown double underlined, and NotI shown in bold italic. The PCR
product is restricted with XhoI-XbaI, isolated by agarose gel electrophoresis, purified by
GeneClean, and inserted into the XhoI-XbaI sites of pBluescript(SK+) by ligation to form
pBSIDI. Sequence analysis is performed to identify the plasmids containing R.
capsulatus DNA identical to the complementary sequence of base pairs 34678-34148,
located on contig rc04 (Rhodobacter Capsulapedia, University of Chicago, Chicago, IL).
Following restriction of pBSIDI with BsaAI-EcoRV, agarose gel electrophoresis and
GeneClean purification, the 0.5 Kb BsaAI-EcoRV DNA fragment containing the R.
capsulatus orf is inserted into the dephosphorylated SmaI site of pHK03 by blunt-end
ligation to create pHK05 (Fig. 9). This establishes the isolation of a previously unknown
and unique orf encoding R. capsulatus IPP isomerase.

A Schizosaccharomyces pombe orf encoding a polypeptide with IPP isomerase
activity is isolated from plasmid pBSF19 (Hahn and Poulter, J. Biol. Chem.
270:11298-11303, 1995) by PCR using the following primers:

3) 5' GCTCTAGATACGTAGGAGGCACATATGAGTTCCCAACAAGAGAAAAA
GGATTATGATGAAGAACAATTAAGG 3' (sense) (SEQ ID NO: 22); and

4) 5' CGCTCGAGCCCGGGGGATCCTTAGCAACGATGAATTAAGGTATC
AATTTTGACGC 3' (antisense) (SEQ ID NO: 23);

containing the restriction site BsaAI shown in bold and the restriction site SmaI shown
double underlined. The 0.7 Kb PCR product is isolated by agarose gel electrophoresis,
purified by GeneClean and inserted into the pT7Blue-3 vector (Novagen, Inc., Madison,
WI) using the Perfectly Blunt™ Cloning Kit (Novagen) according to the manufacturer's
instructions. Sequence analysis is performed to identify constructs containing S. pombe
DNA identical to the published DNA sequence (Hahn and Poulter, 1995); these
constructs are designated pIDI. Following restriction of pIDI with BsaAI-SmaI, isolation
by agarose gel electrophoresis, and purification by GeneClean, the 0.7 Kb BsaAI-SmaI
DNA fragment containing the orf encoding S. pombe IPP isomerase is inserted into the
dephosphorylated SmaI site of pHK03 by blunt-end ligation to create pHK06.

Example 11: Construction of Vectors Containing Alternative Orfs for Mevalonate
Pathway Enzymes and IPP Isomerase

In another exemplified embodiment, vectors containing open reading frames
(orfs) encoding enzymes of the mevalonate pathway and IPP isomerase other than those
described above are constructed. Polynucleotides derived from the yeast Saccharomyces
cerevisiae, the plant Arabidopsis thaliana, and the bacteria Rhodobacter capsulatus and
Streptomyces sp strain CL190 are used for the construction of vectors, including plastid
delivery vehicles, containing orfs for biosynthesis of the encoded enzymes. Construction
of the vectors is not limited to the methods described. One skilled in the art may choose
alternative restriction sites, PCR primers, etc. to create analogous plasmids containing the
same orfs or other orfs encoding the enzymes of the mevalonate pathway and IPP
isomerase.

Specifically, by way of example, genomic DNA is isolated from Streptomyces sp
strain CL190 (American Type Culture Collection, Manassas, VA) using the DNeasy
Tissue Kit (Qiagen) according to the manufacturer's instructions. An orf encoding a
polypeptide with HMG-CoA reductase activity (Takahashi et al., J. Bacteriol.
181:1256-1263, 1999) is isolated from the Streptomyces DNA by PCR using the
following primers:

1) 5' CCGCTCGAGCACGTGAGGAGGCACATATGACGGAAACGCACGCCATAG
CCGGGGTCCCGATGAGG 3' (sense) (SEQ ID NO: 24); and

2) 5' GGGGTACCGCGGCCGCACGCGTCTATGCACCAACCTTTGCGGTCTT
GTTGTCGCGTTCCAGCTGG 3' (antisense) (SEQ ID NO: 25);

containing the restriction site XhoI shown underlined, the restriction site KpnI shown in
italics, the restriction site NotI shown in bold, and the restriction site MluI shown double
underlined. The 1.1 Kb PCR product is isolated by agarose gel electrophoresis, purified
by GeneClean and inserted into the pT7Blue-3 vector (Novagen, Inc., Madison, WI)
using the Perfectly Blunt™ Cloning Kit (Novagen) according to the manufacturer's
instructions. Sequence analysis is performed to identify constructs containing
Streptomyces sp CL190 DNA identical to the published sequence; these constructs are
designated pHMGR2.

Alternatively, using skills as known in the art, an orf encoding a truncated S.
cerevisiae HMG-CoA reductase (Chappell et al., U.S. Patent 5,349,126, 1994) can be
isolated by PCR and inserted into pT7Blue-3 (Novagen, Inc., Madison, WI) to construct
a vector for use in building a gene cluster comprising the entire mevalonate pathway, in
an analogous fashion to the use of the Streptomyces sp CL190 orf encoding HMG-CoA
reductase, as described herein.
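
The restriction sites engineered into the PCR primers of Examples 10 and 11 can be located programmatically. The following minimal sketch scans a primer for the recognition sequences of enzymes named in this disclosure (BsaAI recognizes the degenerate sequence YACGTR). The function name is hypothetical, and the two primers shown are the sense primers of SEQ ID NO: 22 and SEQ ID NO: 24 as printed above.

```python
import re

# Recognition sequences, with IUPAC degeneracy written as regex classes.
RECOGNITION_SITES = {
    "XhoI": "CTCGAG",
    "XbaI": "TCTAGA",
    "KpnI": "GGTACC",
    "NotI": "GCGGCCGC",
    "SmaI": "CCCGGG",
    "MluI": "ACGCGT",
    "EcoRV": "GATATC",
    "BsaAI": "[CT]ACGT[AG]",
}

def find_sites(primer: str) -> dict:
    """Map enzyme name to 0-based start positions of its site in the primer."""
    seq = primer.upper().replace(" ", "")
    hits = {}
    for enzyme, pattern in RECOGNITION_SITES.items():
        # Lookahead so that overlapping occurrences are also reported.
        positions = [m.start() for m in re.finditer(f"(?={pattern})", seq)]
        if positions:
            hits[enzyme] = positions
    return hits

SEQ_ID_22 = ("GCTCTAGATACGTAGGAGGCACATATGAGTTCCCAACAAGAGAAAAA"
             "GGATTATGATGAAGAACAATTAAGG")
SEQ_ID_24 = ("CCGCTCGAGCACGTGAGGAGGCACATATGACGGAAACGCACGCCATAG"
             "CCGGGGTCCCGATGAGG")

print("SEQ ID NO: 22:", find_sites(SEQ_ID_22))
print("SEQ ID NO: 24:", find_sites(SEQ_ID_24))
```

A scan of this kind is a convenient check that a primer carries the sites the text attributes to it, for example the BsaAI site in the S. pombe sense primer and the XhoI site in the Streptomyces sense primer.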

Following restriction of pAACT (see Example 4) with SacII-XbaI, isolation of the
1.2 Kb DNA fragment containing the orf encoding yeast acetoacetyl-CoA thiolase by
agarose gel electrophoresis, and purification by GeneClean, the 1.2 Kb SacII-XbaI DNA
fragment is inserted into the SacII-XbaI sites of pBSHMGS (see Example 4) by ligation
to create pBSCTGS. Following restriction of pHMGR2 with XhoI-KpnI, isolation of the
1.1 Kb DNA fragment by agarose gel electrophoresis, and purification by GeneClean, the
1.1 Kb XhoI-KpnI DNA fragment containing the orf encoding Streptomyces sp CL190
HMG-CoA reductase is inserted into the XhoI-KpnI sites of pBSCTGS by ligation to
create the pBluescript(SK+) derivative, pFH01 (Fig. 10).

A derivative of pFH01 containing an operon with orfs, which in their summation
comprise the entire mevalonate pathway, is constructed as follows: pFH01 is restricted
with SnaBI and the resulting 6.6 Kb blunt-ended DNA fragment is purified by
GeneClean. Following the restriction of pFC01 (see Example 1) with NotI-XhoI, the
resulting 3.9 Kb DNA fragment is isolated by agarose gel electrophoresis and purified by
GeneClean. The 5' overhangs of the 3.9 Kb DNA fragment are filled in with Klenow and
dNTPs. Following purification by GeneClean, the blunt-ended DNA fragment containing
three mevalonate pathway orfs (Hahn et al., 2001) is inserted into the SnaBI site of
pFH01 utilizing directional ligation methodology (Pachuk et al., 2000), thermostable
Ampligase® (Epicentre Technologies, Madison, WI), and the bridging oligonucleotides:

3) 5' GAGCTCCACCGCGGCGGCCGCGTCGACTACGGCCGCAGGAGGAGTTCA
TATGTCAGAGTT 3' (SEQ ID NO: 26); and

4) 5' TCTACCAAAGGAAGAGGAGTTTTAAC
TCAGAACGTTTA 3' (SEQ ID NO: 27);

to form pFH02 (Fig. 11).

A derivative of pFH02 containing an operon with orfs, which in their summation
comprise the entire mevalonate pathway and an orf encoding IPP isomerase, is constructed
as follows: pFH02 is restricted with MluI and the resulting 5' overhangs are filled in with
Klenow and dNTPs. The 10.6 Kb blunt-ended DNA fragment is purified by GeneClean.
Following restriction of pBSIDI with BsaAI-EcoRV, agarose gel electrophoresis and
GeneClean purification, the resulting blunt-ended 0.5 Kb DNA fragment containing the
R. capsulatus IPP isomerase orf is inserted into the filled in MluI site of pFH02 utilizing
directional ligation methodology (Pachuk et al., 2000), thermostable Ampligase®
(Epicentre Technologies, Madison, WI), and the following bridging oligonucleotides:

5) 5' CAAGACCGCAAAGGTTGGTGCATAGACGCGGTAAGGAGGCACATATGA
GTGAGCTTATAC 3' (SEQ ID NO: 28); and

6) 5' CCTGCGCGGCTGAGCGGCCGCGGATCCGATCGCGTGCGGCCGCGGTACC
CAATTCGCCCT 3' (SEQ ID NO: 29);

to form pFH03 (Fig. 12).

Following the restriction of pBluescript(SK+) with SacII-XbaI and purification
by GeneClean, a 1.3 Kb SacII-XbaI DNA fragment containing the orf encoding S.
cerevisiae acetoacetyl-CoA thiolase, isolated from pAACT (see Example 4) by restriction
and agarose gel electrophoresis, is inserted into pBluescript(SK+)/SacII-XbaI by ligation.
The resulting plasmid, pBSAACT, is restricted with XbaI, treated with Klenow and
dNTPs, and purified by GeneClean. Following restriction of Streptomyces sp CL190
genomic DNA with SnaBI, a blunt-ended 6.8 Kb DNA fragment, containing five (5) orfs
encoding polypeptides with HMG-CoA synthase, HMG-CoA reductase, mevalonate
kinase, phosphomevalonate kinase, mevalonate diphosphate decarboxylase and IPP
isomerase enzymatic activities (Takagi et al., J. Bacteriol. 182:4153-4157, 2000 and
Kuzuyama et al., Proc. Natl. Acad. Sci. USA 98:932-7, 2001), is isolated by agarose gel
electrophoresis, purified by GeneClean and inserted into the filled in XbaI site of
pBSAACT utilizing directional ligation methodology (Pachuk et al., 2000), thermostable
Ampligase® (Epicentre Technologies, Madison, WI), and the bridging oligonucleotides:

7) 5' TGTCATTGAAAAGATATGAGGATCCTCTAGGTACTTCCCTGGCGTGTGC
AGCGGTTGACG 3' (SEQ ID NO: 30); and

8) 5' CGATTCCGCATTATCGGTACGGGTGCC
CGGGCTGCAGG 3' (SEQ ID NO: 31);

to form pFH04 (Fig. 13). Transformation experiments to isolate pFH04 constructs are
performed with E. coli competent cells utilizing media containing ampicillin.
Alternatively, media containing only fosmidomycin (20 µg/ml) as the selection agent is
used for the direct isolation of pFH04 constructs containing the Streptomyces sp CL190
gene cluster.

The construction of vectors pHK02, pHK03, pHK05, pHK06, pFH02, pFH03, and
pFH04 illustrates the many ways of combining orfs isolated from a variety of organisms
to encode polypeptides such that in their summation they comprise the entire mevalonate
pathway or comprise the entire mevalonate pathway and IPP isomerase.
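
The enzyme content of the vectors listed above can be summarized in a small lookup table. The sketch below records, for each vector, the enzymatic activities that the preceding Examples say it encodes and reports whether the six-step mevalonate pathway is complete. It tracks activities only, not the source organism of each orf, and the names `VECTORS` and `describe` are illustrative.

```python
MEVALONATE_PATHWAY = frozenset({
    "acetoacetyl-CoA thiolase",
    "HMG-CoA synthase",
    "HMG-CoA reductase",
    "mevalonate kinase",
    "phosphomevalonate kinase",
    "mevalonate diphosphate decarboxylase",
})
IDI = "IPP isomerase"

VECTORS = {
    "pHK01": frozenset({"acetoacetyl-CoA thiolase", "HMG-CoA synthase",
                        "HMG-CoA reductase"}),
    "pHK02": MEVALONATE_PATHWAY,
    "pHK03": MEVALONATE_PATHWAY,
    "pHK05": MEVALONATE_PATHWAY | {IDI},
    "pHK06": MEVALONATE_PATHWAY | {IDI},
    "pFH02": MEVALONATE_PATHWAY,
    "pFH03": MEVALONATE_PATHWAY | {IDI},
    "pFH04": MEVALONATE_PATHWAY | {IDI},
}

def describe(vector: str) -> str:
    """Summarize whether a vector encodes the complete pathway."""
    orfs = VECTORS[vector]
    if MEVALONATE_PATHWAY <= orfs:
        return "complete pathway + IPP isomerase" if IDI in orfs else "complete pathway"
    missing = sorted(MEVALONATE_PATHWAY - orfs)
    return "partial pathway, missing: " + ", ".join(missing)

for name in VECTORS:
    print(name, "->", describe(name))
```

pHK01 is included only as a contrast: it carries the first three activities and therefore reports the three downstream steps as missing.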

Example 12: Construction of Tobacco Plastid Transformation Vectors pHK07 and
pHK08

In a specific, exemplified embodiment, tobacco plastid-specific transformation
vectors containing orfs, which in their summation comprise the mevalonate pathway, and
an additional orf encoding IPP isomerase are constructed as follows: Restriction of
pHK05 with NotI generates a DNA fragment containing six orfs comprising the entire
mevalonate pathway and an additional orf encoding R. capsulatus IPP isomerase.
Restriction of pHK06 with EagI generates a DNA fragment containing the six orfs
comprising the complete mevalonate pathway and an additional orf encoding S. pombe
IPP isomerase. Following isolation by agarose gel electrophoresis and purification by
GeneClean, the 8.2 Kb NotI DNA fragment from pHK05 is blunt-ended with Klenow
and dNTPs and inserted into the blunt-ended BglII site of pBSNT27 utilizing chain
reaction cloning (Pachuk et al., 2000), thermostable Ampligase® (Epicentre
Technologies, Madison, WI), and the following bridging oligonucleotides:
1) 5' CTTTCCTGAAACATAATTTA^ 
TATGTCAGAGTT 3' (SEQ ID NO: 32); and 

2) 5' TTCGGATCGATCCTGCGCGGCTGAGCGGCCGATCTAAACAAACCCGGA
ACAGACCGTTGG 3' (SEQ ID NO: 33);

to create the plastid delivery vehicle pHK07 (Fig. 14) containing orfs encoding the entire
mevalonate pathway and an orf encoding R. capsulatus IPP isomerase. Following
isolation by agarose gel electrophoresis and purification by GeneClean, the 8.4 Kb EagI
DNA fragment from pHK06 is blunt-ended with Klenow and dNTPs and inserted into
the blunt-ended BglII site of pBSNT27 utilizing chain reaction cloning (Pachuk et al.,
2000), thermostable Ampligase® (Epicentre Technologies, Madison, WI), and the
following bridging oligonucleotides:

3) 5' CTTTCCTGAAACATAATTTATAATCAGATCGGCCGCAGGAGGAGTTCA
TATGTCAGAGT 3' (SEQ ID NO: 34); and

4) 5' TCGTTGCTAAGGATCCCCCGGGATCCGGCCGATCTAAACAAACCCGGA
ACAGACCGTTGG 3' (SEQ ID NO: 35);

to create the plastid delivery vehicle pHK08 containing orfs encoding the entire
mevalonate pathway plus the S. pombe IPP isomerase orf.

Alternatively, either of the IPP isomerase orfs described above can be inserted
alone, without orfs for the mevalonate pathway, directly into pBSNT27 (or into any
suitable plant transformation vector known in the art), using skills known in the art.

Example 13: Construction of Vectors Used for Increasing Carotenoid Production
(pHK09, pHK10, pHK11, pHK12, and pHK13)

In yet another exemplified embodiment, a derivative of pTrcHisB (Invitrogen)
containing a synthetic operon comprising orfs, which in their summation comprise the
entire mevalonate pathway, is constructed as follows: A unique NotI site is inserted into
pTrcHisB utilizing the following oligonucleotides:

1) 5' CATGGCGGCCGCG 3' (SEQ ID NO: 36); and 

2) 5' GATCCGCGGCCGC 3' (SEQ ID NO: 37); 

that upon annealing, form a double-stranded DNA linker containing NotI with 5'
overhangs compatible with StyI and BamHI. Following restriction of pTrcHisB with
StyI-BamHI, isolation of the resulting 4.3 Kb DNA fragment by agarose gel
electrophoresis, and its purification by GeneClean, the NotI linker is inserted into
pTrcHisB/StyI-BamHI by ligation. Restriction analysis with BsaAI-NotI confirms the
successful construction of pTrcHisB-NotI (pTHBN1) by the presence of both 2.5 and 1.8
Kb DNA fragments. Following restriction of pHK03 with EagI, the 7.7 Kb DNA
fragment, containing the six mevalonate pathway orfs, is isolated by agarose gel
electrophoresis, purified by GeneClean, and inserted into the NotI site of pTHBN1
utilizing directional ligation methodology (Pachuk et al., 2000), thermostable
Ampligase® (Epicentre Technologies, Madison, WI), and the bridging oligonucleotides:

3) 5' TTAAATAAGGAGGAATAAACCATG
GTCAGAGTTGAGA 3' (SEQ ID NO: 38); and

4) 5' AACAACAACAACATGACCCGGGATCCGGCCGCGATCCGAGCTCGAGA
TCTGCAGCTGGTA 3' (SEQ ID NO: 39);

to form pHK09 (Fig. 15).

Derivatives of pTHBN1 containing the entire mevalonate pathway plus an
additional orf encoding IPP isomerase are constructed as follows: Following restriction
of pHK05 with NotI, the 8.2 Kb DNA fragment, containing the six mevalonate pathway
orfs plus an orf encoding R. capsulatus IPP isomerase, is isolated by agarose gel
electrophoresis, purified by GeneClean, and inserted into the NotI site of pTHBN1
utilizing directional ligation methodology (Pachuk et al., 2000), thermostable
Ampligase® (Epicentre Technologies, Madison, WI), and the bridging oligonucleotides:

5) 5' TCGATTAAATAAGGAGGAATAAACCATGGCGGCCGCAGGAGGAGTTCA 
TATGTCAGAGTT 3' (SEQ ID NO: 40); and 

6) 5' GATTTTCGGATCGATCCTGCGCGGCTGAGCGGCCGCGATCCGAGCTCG 
AGATCTGCAGCT 3' (SEQ ID NO: 41); 

to form pHK10 (Fig. 16). Following restriction of pHK06 with EagI, the 8.4 Kb DNA
fragment, containing the six mevalonate pathway orfs plus an orf encoding S. pombe IPP
isomerase, is isolated by agarose gel electrophoresis, purified by GeneClean, and inserted
into the NotI site of pTHBN1 utilizing directional ligation methodology (Pachuk et al.,
2000), thermostable Ampligase® (Epicentre Technologies, Madison, WI), and the
following bridging oligonucleotides:

7) 5' TCGATTAAATAAGGAGGAATAAACCATGGCGGCCGCAGGAGGAGTTCA 
TATGTCAGAGTT 3' (SEQ ID NO: 42); and 

8) 5' TTCATCGTTGCTAAGGATCCCCCGGGATCCGGCCGCGATCCGAGCTCG 
AGATCTGCAGCT 3' (SEQ ID NO: 43); 

to form pHK11.

Derivatives of pTHBN1 containing only an orf encoding IPP isomerase are
constructed as follows: pTHBN1 is restricted with NotI and the resulting 5' overhangs
are filled in with Klenow and dNTPs. The 4.3 Kb pTHBN1/NotI blunt-ended DNA
fragment is GeneClean purified. Following restriction of pBSIDI with BsaAI-EcoRV,
agarose gel electrophoresis and GeneClean purification, the resulting blunt-ended 0.5 Kb
DNA fragment containing the R. capsulatus IPP isomerase orf is inserted into the filled
in NotI site of pTHBN1 utilizing chain reaction cloning (Pachuk et al., 2000),
thermostable Ampligase® (Epicentre Technologies, Madison, WI), and the following
bridging oligonucleotides:

9) 5' TTAAATAAGGAGGAATAAACCATGGCGGCCGTAAGGAGGCACATATC 
AGTGAGCTTATAC T 3' (SEQ ID NO: 44); and 
10) 5' GCCTGCGCGGCTGAGCGGCCGCGGATCCGATGGCCGCGATCCGAGCTC
GAGATCTGCAGCT 3' (SEQ ID NO: 45);

to form pHK12. Following restriction of pIDI with BsaAI-SmaI, agarose gel
electrophoresis and GeneClean purification, the resulting blunt-ended 0.7 Kb DNA
fragment containing the S. pombe IPP isomerase orf is inserted into the filled in NotI site
of pTHBN1 utilizing chain reaction cloning (Pachuk et al., 2000), thermostable
Ampligase® (Epicentre Technologies, Madison, WI), and the bridging oligonucleotides:

11) 5' TTAAATAAGGAGGAATAAACCATGGCGGCCGTAGGAGGCACATATGA 
GTTCCCAACAAGA 3' (SEQ ID NO: 46); and 

12) 5' ACCTTAATTCATCGTTGCTAAGGATCCCCCGGCCGCGATCCGAGCTCG
AGATCTGCAGCT 3' (SEQ ID NO: 47);

to form pHK13.
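
The fragment sizes quoted in this Example allow a quick consistency check of the pTHBN1 derivatives. The sketch below adds each stated insert size to the 4.3 Kb pTHBN1 backbone and verifies that the BsaAI-NotI diagnostic fragments sum to the backbone size. It is rough bookkeeping only: the few bases gained or lost at filled-in or ligated junctions are ignored, and the function and variable names are illustrative.

```python
PTHBN1_KB = 4.3  # pTrcHisB StyI-BamHI fragment plus NotI linker

# Insert sizes (Kb) as stated in Example 13.
INSERT_KB = {
    "pHK09": 7.7,  # six mevalonate pathway orfs (pHK03 EagI fragment)
    "pHK10": 8.2,  # pathway + R. capsulatus IPP isomerase
    "pHK11": 8.4,  # pathway + S. pombe IPP isomerase
    "pHK12": 0.5,  # R. capsulatus IPP isomerase only
    "pHK13": 0.7,  # S. pombe IPP isomerase only
}

def expected_size_kb(plasmid: str) -> float:
    """Approximate plasmid size: backbone plus insert, rounded to 0.1 Kb."""
    return round(PTHBN1_KB + INSERT_KB[plasmid], 1)

def digest_consistent(fragments_kb, plasmid_kb, tolerance_kb=0.15) -> bool:
    """True if digest fragments add up to the plasmid size within tolerance."""
    return abs(sum(fragments_kb) - plasmid_kb) <= tolerance_kb

for name in INSERT_KB:
    print(name, expected_size_kb(name), "Kb")

# Diagnostic BsaAI-NotI digest of pTHBN1 reported as 2.5 and 1.8 Kb.
print(digest_consistent([2.5, 1.8], PTHBN1_KB))
```

The same check applies to pHK02 in Example 6, where the 5.7 and 6.2 Kb fragments and the 3.9 and 7.9 Kb fragments both agree, within gel resolution, with the sum of the 8.1 Kb vector and 3.7 Kb insert.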

Example 14: Increased Isoprenoid Production in Cells Containing the MEP Pathway 
In another exemplified embodiment, a carotenoid producing E. coli strain is
utilized to demonstrate the effect of the insertion of orfs encoding the entire mevalonate
pathway, or orfs encoding the entire mevalonate pathway and IPP isomerase, or an orf
encoding just IPP isomerase, on production of lycopene as follows: Following the
transformation of E. coli TOP10 F' (Invitrogen) with pAC-LYC (Cunningham et al., J.
Bacteriol. 182:5841-5848, 2000), transformed cells are isolated on LB/Cam (30 µg/ml)
plates grown at 30° C. TOP10 F'/pAC-LYC competent cells are prepared by the CaCl2
method (Sambrook et al., 1989) following growth in LB/Cam in darkness at 28° C and
225 rpm to an optical density (A600) of 0.6. Competent TOP10 F'/pAC-LYC cells are
transformed with one of the following plasmids: pTrcHisB; pHK09, a pTrcHisB
derivative containing the entire mevalonate pathway; pHK10, a pTrcHisB derivative
containing the entire mevalonate pathway plus the orf encoding R. capsulatus IPP
isomerase; pHK11, a pTrcHisB derivative containing the entire mevalonate pathway plus
the orf encoding S. pombe IPP isomerase; pHK12, a pTrcHisB derivative containing the
orf encoding R. capsulatus IPP isomerase; and pHK13, a pTrcHisB derivative containing
the orf encoding S. pombe IPP isomerase. The bacterial strains described above,
comprising pTHBN1 derivatives containing the mevalonate pathway orfs and/or an orf
encoding IPP isomerase, are designated HK1, HK2, HK3, HK4, and HK5 respectively.

The resulting transformants are isolated as colonies from LB/Cam/amp plates grown at
30° C. Single colonies of TOP10 F'/pAC-LYC/pTrcHisB and HK1 (TOP10
F'/pAC-LYC/pHK09) are used to individually inoculate 4 ml LB/Cam/amp cultures and
grown overnight in the dark at 28° C and 225 rpm. The cultures are serially diluted
10,000 to 100,000-fold, plated on LB/Cam/amp medium containing IPTG, and grown in
the dark at rt for 2 to 10 days. The plates are visually examined for an increase in
lycopene production as evident by a "darkening" of the light pink colored colonies that
are present on the control plates corresponding to TOP10 F'/pAC-LYC/pTrcHisB. The
same experiments are performed with strains HK2, HK3, HK4, and HK5 to determine,
visually, the effect of the orfs contained within pHK10, pHK11, pHK12, and pHK13 on
lycopene production in TOP10 F'/pAC-LYC cells. The quantification of the carotenoid
lycopene in cells, identified as potential overproducers due to their darker color when
compared to the color of TOP10 F'/pAC-LYC/pTHBN1 cells, is performed utilizing a
spectrophotometric assay as described by Cunningham et al. (Cunningham et al., 2000).
Increased production of lycopene in E. coli cells containing the entire mevalonate
pathway or the entire mevalonate pathway plus an additional orf for IPP isomerase
establishes that the presence in cells of an additional biosynthetic pathway for the
formation of IPP or IPP and DMAPP enhances the production of isoprenoid compounds,
such as carotenoids, that are derived from IPP and DMAPP.
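
The strain designations used in this Example pair the five pTHBN1 derivatives with HK1 through HK5 in the order listed. The following sketch records that pairing as a lookup table and shows how the test strains can be grouped by what their plasmid carries. The dictionary and function names are illustrative. The control strain carrying pTrcHisB has no HK designation in the text and is therefore omitted.

```python
# Host for all strains: E. coli TOP10 F' carrying pAC-LYC.
STRAIN_PLASMID = {
    "HK1": "pHK09",
    "HK2": "pHK10",
    "HK3": "pHK11",
    "HK4": "pHK12",
    "HK5": "pHK13",
}

# (carries mevalonate pathway, IPP isomerase source or None), per Example 13.
PLASMID_CONTENT = {
    "pHK09": (True, None),
    "pHK10": (True, "R. capsulatus"),
    "pHK11": (True, "S. pombe"),
    "pHK12": (False, "R. capsulatus"),
    "pHK13": (False, "S. pombe"),
}

def strains_with(pathway: bool, isomerase: bool) -> list:
    """Strains whose plasmid matches the requested combination."""
    selected = []
    for strain, plasmid in STRAIN_PLASMID.items():
        has_pathway, idi_source = PLASMID_CONTENT[plasmid]
        if has_pathway == pathway and (idi_source is not None) == isomerase:
            selected.append(strain)
    return selected

print("pathway only:        ", strains_with(True, False))
print("pathway + isomerase: ", strains_with(True, True))
print("isomerase only:      ", strains_with(False, True))
```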

Example 15: Demonstration of Antibiotic Resistance Due to the Mevalonate Pathway
in MEP Pathway Dependent Cells

In still another exemplified embodiment, E. coli cells are transformed with DNA
containing orfs, which in their summation comprise the entire mevalonate pathway, and
the resulting cells are tested for resistance to the antibiotic fosmidomycin as follows:
Following the separate transformation of E. coli TOP10 F' (Invitrogen) with pHK02,
pHK03 and pHK09, transformed cells are isolated on LB/Amp (50 µg/ml) plates grown
at 30° C. Single colonies of TOP10 F'/pHK02 (designated strain HK6), TOP10
F'/pHK03 (designated strain HK7), and TOP10 F'/pHK09 (designated strain HK8) are
used to individually inoculate 4 ml LB/amp cultures and grown overnight at 30° C, 225
rpm. The HK6 and HK7 cultures are serially diluted 10,000 to 100,000-fold and plated
on LB containing fosmidomycin (20 µg/ml). The HK8 cultures are serially diluted
10,000 to 100,000-fold and plated on LB/IPTG containing fosmidomycin (20 µg/ml).
Controls are performed with cells comprising TOP10 F' transformed with the parent
vectors of pHK02, pHK03 and pHK09, by plating on the appropriate medium
containing fosmidomycin, establishing that E. coli control cells are unable to grow on
medium containing fosmidomycin. The ability of transformed E. coli cells to grow in the
presence of the antibiotic fosmidomycin establishes that the inserted DNA, comprising
the entire mevalonate pathway and thus an alternative biosynthetic route to IPP, is
functional and can circumvent the inhibition of an enzyme in the trunk line of the MEP
pathway.
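
Preparing the selective media named in Examples 14 and 15 is a simple C1V1 = C2V2 dilution. The sketch below computes the volume of antibiotic stock needed to reach the working concentrations stated in the text (fosmidomycin 20 µg/ml, ampicillin 50 µg/ml, chloramphenicol 30 µg/ml). The stock concentrations and the 500 ml batch size are assumptions chosen for illustration and do not appear in the source.

```python
def stock_volume_ul(final_ug_per_ml: float,
                    medium_ml: float,
                    stock_mg_per_ml: float) -> float:
    """Microlitres of stock needed so that medium_ml reaches final_ug_per_ml."""
    required_ug = final_ug_per_ml * medium_ml
    stock_ug_per_ul = stock_mg_per_ml  # 1 mg/ml equals 1 ug/ul
    return required_ug / stock_ug_per_ul

# Working concentrations from the text (ug/ml).
WORKING = {"fosmidomycin": 20.0, "ampicillin": 50.0, "chloramphenicol": 30.0}

# ASSUMPTION: hypothetical stock concentrations in mg/ml.
ASSUMED_STOCK = {"fosmidomycin": 20.0, "ampicillin": 50.0, "chloramphenicol": 30.0}

MEDIUM_ML = 500.0  # ASSUMPTION: batch size of molten LB agar

for drug, final in WORKING.items():
    volume = stock_volume_ul(final, MEDIUM_ML, ASSUMED_STOCK[drug])
    print(f"{drug}: add {volume:.0f} ul of {ASSUMED_STOCK[drug]:.0f} mg/ml stock")
```

With 1000X stocks such as those assumed here, each antibiotic is added at 1 µl per ml of medium.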

15 Example 16: Construction of Plastid Transformation Vectors 

In a specific, exemplified embodiment, a plant plastid transformation vector
containing a synthetic operon comprising orfs, which in their summation is the entire
mevalonate pathway, is constructed as follows: Plasmid pHK03, a pBluescript
derivative containing all six mevalonate pathway orfs, is assembled by restriction of
pFCO1 to yield a 3.9 Kb NotI-XhoI DNA fragment containing three mevalonate orfs and
its subsequent insertion into the SalI-NotI sites of pHK01 by directional ligation as
described above in Example 8. The plastid transformation vehicle, pHK14, containing
the entire mevalonate pathway, is constructed as follows: Plastid vector pGS104 (Serino
and Maliga, Plant J. 12:687-701, 1997) is restricted with NcoI-XbaI and the two resulting
DNA fragments are separated by agarose gel electrophoresis. Following isolation of the
larger DNA fragment by gel excision and its purification by GeneClean, the NcoI-XbaI
5' overhangs are dephosphorylated using SAP and filled in with Klenow and dNTPs. The
resulting blunt-ended, dephosphorylated DNA fragment derived from pGS104 is
GeneClean purified. Following restriction of pHK03 with EagI, isolation by agarose gel
electrophoresis, and purification by GeneClean, the 7.7 Kb DNA fragment is treated with
Klenow and dNTPs to fill in the 5' overhangs. The resulting blunt-ended DNA fragment
containing the mevalonate pathway is purified by GeneClean and inserted into the
dephosphorylated, Klenow-treated NcoI-XbaI sites of pGS104 by blunt-end ligation to
yield pHK14.
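
The Klenow fill-in steps above can be illustrated with a short Python sketch. The recognition sequences and cut positions below are the commonly tabulated values for these enzymes and are supplied here as assumptions; they are not recited in this description, and the helper names are hypothetical.

```python
# Palindromic recognition site and the top-strand cut offset (assumed standard values).
ENZYMES = {
    "NcoI": ("CCATGG", 1),    # C^CATGG
    "XbaI": ("TCTAGA", 1),    # T^CTAGA
    "EagI": ("CGGCCG", 1),    # C^GGCCG
    "NotI": ("GCGGCCGC", 2),  # GC^GGCCGC
}


def five_prime_overhang(enzyme: str) -> str:
    """Single-stranded 5' extension left by the enzyme (top-strand notation)."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def filled_in_ends(enzyme: str) -> tuple[str, str]:
    """Top-strand sequence at each blunt end after Klenow fill-in.

    Returns (end of the upstream fragment, start of the downstream fragment).
    Filling in a 5' overhang copies the overhang, so both fragments keep it.
    """
    site, cut = ENZYMES[enzyme]
    return site[:len(site) - cut], site[cut:]


for name in ENZYMES:
    print(name, five_prime_overhang(name), filled_in_ends(name))
```

Because every enzyme listed leaves a 5' extension, Klenow polymerase can extend the recessed 3' end to produce the blunt ends required for the blunt-end ligations described.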

Derivatives of pGS104 containing the entire mevalonate pathway plus an
additional orf encoding IPP isomerase are constructed as follows: Following restriction
of pHK05 with NotI and treatment with Klenow and dNTPs, the resulting 8.2 Kb
blunt-ended DNA fragment, containing the six mevalonate pathway orfs plus an orf
encoding R. capsulatus IPP isomerase, is isolated by agarose gel electrophoresis, purified
by GeneClean, and inserted into the dephosphorylated, filled in NcoI-XbaI sites of
pGS104 by blunt-end ligation to yield pHK15. Following restriction of pHK06 with
EagI and treatment with Klenow and dNTPs, the resulting 8.4 Kb blunt-ended DNA
fragment, containing the six mevalonate pathway orfs plus an orf encoding S. pombe IPP
isomerase, is isolated by agarose gel electrophoresis, purified by GeneClean, and inserted
into the dephosphorylated, filled in NcoI-XbaI sites of pGS104 by blunt-end ligation to
yield pHK16.

Derivatives of pGS104 containing only an orf encoding IPP isomerase are
constructed as follows: Following restriction of pBSIDI with BsaAI-EcoRV, agarose gel
electrophoresis and GeneClean purification, the resulting blunt-ended 0.5 Kb DNA
fragment containing the R. capsulatus IPP isomerase orf is inserted into the
dephosphorylated, filled in NcoI-XbaI sites of pGS104 by blunt-end ligation to yield
pHK17. Following restriction of pIDI with BsaAI-SmaI, agarose gel electrophoresis and
GeneClean purification, the resulting blunt-ended 0.7 Kb DNA fragment containing the
S. pombe IPP isomerase orf is inserted into the dephosphorylated, filled in NcoI-XbaI
sites of pGS104 by blunt-end ligation to yield pHK18.

Example 17: Construction of Transplastomic Plants Containing Orfs Encoding the
Mevalonate Pathway or Orfs Encoding the Mevalonate Pathway Coupled with IPP
Isomerase

In another exemplified embodiment, tobacco is engineered at the plastid level by
using any of the plastid transformation vectors described above, or their equivalents, such
as variants of those plastid transformation vectors as can be routinely constructed by
means known in the art and containing the orfs as taught and described above.
Specifically, Nicotiana tabacum var. 'Xanthi NC' leaf sections (1 x 0.5 cm strips from
in vitro plants with 3 to 5 cm long leaves) are centered in the dish, top side up, and
bombarded with 1 µm gold microparticles (Kota et al., 1999) coated with DNA
containing orfs, which in their summation comprise the entire mevalonate pathway, using
a PDS 1000 He device, at 1100 psi. Toxicity is evident in tobacco after three weeks of
growth on medium containing the antibiotic fosmidomycin at a concentration of at least
500 micromolar. Transplastomic plants are recovered from leaf sections cultured under
lights on standard RMOP shoot regeneration medium or on a Murashige-Skoog salts
shoot regeneration medium with 3% sucrose, Gamborg's B5 vitamins, 2 mg/L
6-benzylamino-purine and Phytagel (2.7 g/L), containing 500 µM fosmidomycin for the
direct selection of insertion of the entire mevalonate pathway into plastids. Alternatively,
the regeneration medium contains an antibiotic, e.g. spectinomycin, for selection based
on antibiotic resistance due to any co-transformed gene on the transforming DNA vector,
as would be readily apparent to the skilled artisan. De novo green leaf tissue is visible
after three weeks. Tissue is removed to undergo a second round of selection on shoot
regeneration medium with 500 µM fosmidomycin to encourage homoplasmy, and plants
are rooted. Genomic DNA is isolated from T0 leaf tissue or T1 leaf tissue derived from
in vitro germinated transplastomic seeds utilizing the DNeasy Plant Mini Kit (Qiagen Inc,
Valencia, CA) according to the manufacturer's instructions and is subjected to analysis
as is known in the art to confirm homoplasmy. The ability to select directly for a
transformation event corresponding to the successful insertion of the mevalonate pathway
orfs into plastids establishes the use of orfs, which in their summation comprise the entire
mevalonate pathway, as a selectable marker for plastid transformation. The construction
of fosmidomycin resistant plants establishes the ability of the mevalonate pathway, when
functioning in plant plastids, to provide an alternate biosynthetic route to IPP, thus
overcoming the effect of an inhibitor targeting an enzyme in the trunk line of the MEP
pathway.

Example 18: Metabolic engineering in transplastomic Solanaceae plants 

In another exemplified embodiment, Solanaceae species are engineered at the
plastid level using infA pseudogene insertion of a selectable marker and orfs for
expression. Specifically, leaf sections of a genetically defined white petunia (or other
petunia), are engineered, as for the Solanaceous species tobacco (see Example 16), using
vectors pHK04 or pHK07, or their equivalents, for insertion of orfs encoding the entire
mevalonate pathway or orfs encoding the entire mevalonate pathway and IPP isomerase.
Transplastomic Solanaceae plants containing orfs encoding the entire mevalonate
pathway and IPP isomerase, and containing an additional orf encoding phytoene synthase,
are created by insertion of a pBSNT27 (see Example 9) derived vector, constructed as
follows:

A Rhodobacter capsulatus orf encoding a polypeptide with phytoene synthase
activity is isolated by PCR from genomic DNA using the primers

1) 5' GCGATATCGGATCCAGGAGGACCATATGATCGCCGAAGCGGATATGGAGGTCTGC 3' (sense) (SEQ ID NO: 65)

2) 5' GCGATATCAAGCTTGGATCCTCAATCCATCGCCAGGCCGCGGTCGCGCGC 3' (antisense) (SEQ ID NO: 66)

containing the restriction site BamHI (GGATCC). The 1.1 Kb PCR product is
isolated by agarose gel electrophoresis, purified by GeneClean and inserted into the
pT7Blue-3 vector (Novagen) using the Perfectly Blunt Cloning Kit (Novagen) according
to the manufacturer's instructions. Sequence analysis is performed to identify constructs
containing R. capsulatus DNA identical to the published DNA sequence (SEQ ID NO:
71), which are designated pPHS. Following restriction of pPHS with BamHI, isolation by
agarose gel electrophoresis, and purification by GeneClean, the 1.1 Kb BamHI DNA
fragment containing the orf encoding R. capsulatus phytoene synthase is inserted into the
BglII site of pBSNT27 utilizing chain reaction cloning (Pachuk et al., 2000), thermostable
Ampligase (Epicentre Technologies, Madison, WI), and the bridging oligonucleotides

3) 5' CTTTCCTGAAACATAATTTATAATCAGATCCAGGAGGACCATATGATCGCCGAAGCGGAT 3' (SEQ ID NO: 67); and

4) 5' CGACCGCGGCCTGGCGATGGATTGAGGATCTAAACAAACCCGGAACAGACCGTTGGGAAG 3' (SEQ ID NO: 68);

to create plastid transformation vector pFH05. Following restriction of pFH05 with
XcmI, a unique site in the infA pseudogene, and purification by GeneClean, the resulting
3' overhangs are removed by treatment with Mung Bean nuclease and the resulting
blunt-ended DNA fragment is purified by GeneClean. Vector pFH03 is restricted with
NotI and the resulting 8.3 Kb DNA fragment, containing Operon E, is isolated by agarose
gel electrophoresis and purified by GeneClean. The 5' overhangs of the isolated DNA
fragment are filled in with Klenow and dNTPs and the resulting blunt-ended DNA
fragment, containing Operon E, is inserted into the Mung Bean nuclease treated XcmI site
of pFH05 utilizing chain reaction cloning (Pachuk et al., 2000), thermostable Ampligase
(Epicentre Technologies, Madison, WI), and the bridging oligonucleotides

5) 5' ATTTTTCATCTCGAATTGTATTCCCACGAAGGCCGCGTCGACTACGGCCGCAGGAGGAGT 3' (SEQ ID NO: 69); and

6) 5' TTCGGATCGATCCTGCGCGGCTGAGCGGCCGGAATGGTGAAGTTGAAAAACGAATCCTTC 3' (SEQ ID NO: 70);

to create the plastid transformation vector pFH06 (Fig. 17).
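
As a quick check on the design of PCR primers 1 and 2 (SEQ ID NO: 65 and 66), the following Python sketch locates restriction sites within them. Only the BamHI site is identified in the text above; the other site names are annotations added here from standard recognition sequences and should be read as assumptions, as should the helper function names.

```python
SENSE = "GCGATATCGGATCCAGGAGGACCATATGATCGCCGAAGCGGATATGGAGGTCTGC"   # SEQ ID NO: 65
ANTISENSE = "GCGATATCAAGCTTGGATCCTCAATCCATCGCCAGGCCGCGGTCGCGCGC"     # SEQ ID NO: 66

# Recognition sequences (standard values; all except BamHI are not named in the text).
SITES = {"BamHI": "GGATCC", "EcoRV": "GATATC", "NdeI": "CATATG", "HindIII": "AAGCTT"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq: str) -> dict[str, list[int]]:
    """Map each enzyme name to the 0-based positions where its site starts."""
    return {
        name: [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]
        for name, site in SITES.items()
    }


print("sense    :", find_sites(SENSE))
print("antisense:", find_sites(ANTISENSE))
# The three bases after the antisense BamHI site read as a stop codon on the coding strand.
print("codon after antisense BamHI site:", reverse_complement(ANTISENSE[20:23]))
```

Both primers carry a single BamHI site near the 5' end, consistent with excision of the 1.1 Kb orf as a BamHI fragment from pPHS.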

Alternatively, an orf encoding IPP isomerase can be inserted into the XcmI site
of pFH05, utilizing skills as known in the art, to create a plastid transformation vector
containing both an orf encoding phytoene synthase and an orf encoding IPP isomerase.
Another alternative uses the infA pseudogene as an insertion site for orfs, encoding
phytoene synthase, and/or IPP isomerase, and/or the entire mevalonate pathway, linked
with the aadA gene as is known in the art for selection of transplastomic plastids on 500
microgram per liter spectinomycin.

The BioRad PDS 1000 He gene gun is used to deliver BioRad tungsten M10 (0.7
micron approx.) microspheres into petunia (Petunia hybrida 'Mitchell') leaves positioned
top-side up. Intact leaves, or equivalent tissues of about 6-8 cm² per sample are plated
onto shoot regeneration medium consisting of Murashige and Skoog basal medium, B5
vitamins, 3% sucrose, 0.7% (w/v) agar and 3 mg/l BA (6-benzylamino-purine), 0.1 mg/l
IAA (Deroles and Gardner, Plant Molec. Biol. 11:355-364, 1988) in 100 x 10 mm plastic
Petri dishes. Leaves are centered in the target zone of the gene gun for bombardment at
1100 psi, third shelf from bottom, ~5.6 cm gap, 28 mgHg vacuum. M10 microspheres
are coated with DNA using standard procedures of CaCl2 and spermidine precipitation,
1.5 to 2 µg DNA/bombardment. After bombardment, tissues are cultured in light in the
presence of antibiotic (500 micromolar fosmidomycin). Each leaf sample is then cut into
about 6 pieces and cultured on petunia shooting medium containing 500 micromolar
fosmidomycin for 3 to 8 weeks, with subculture onto fresh medium every three weeks.
Any green shoots are removed and leaves plated onto the same medium containing 500
micromolar fosmidomycin. Plantlets with at least four leaves and of solid green color (no
bleaching on petioles or whorls) are transferred for rooting onto solidified hormone-free
Murashige and Skoog salts with B5 vitamins and 2% sucrose and are grown to flowering.
The dependency of increased carotenoid production in Solanaceae on the combination of
the orfs inserted, be it an orf encoding phytoene synthase alone; or orfs encoding the
entire mevalonate pathway and phytoene synthase; or orfs encoding phytoene synthase,
the entire mevalonate pathway and IPP isomerase; or orfs for phytoene synthase and IPP
isomerase, establishes that the addition of the mevalonate pathway and/or IPP isomerase
to plant plastids enhances the production of isoprenoid compounds that are derived from
IPP and DMAPP; and the suitability of a pseudogene insertion site for creating
transplastomic Petunia.

Example 19: Transformation of microalgae 

In a specific exemplified embodiment, chloroplast transformants are obtained by
microprojectile bombardment of Chlamydomonas reinhardtii cells and subsequent
selection on fosmidomycin. Specifically, a gene cluster containing the complete
mevalonate pathway is substituted, as a selectable marker, for the coding sequence of the
aadA gene in the pUC18 derived vector containing 5'-atpA:aadA:rbcL-3'
(Goldschmidt-Clermont M., Nucleic Acids Res. 19:4083-4089, 1991) as follows:
Plasmid pUC-atpX-AAD is restricted with NcoI, purified by GeneClean and treated with
Mung Bean nuclease to remove the resulting 5' overhangs. Following GeneClean
purification, the blunt-ended DNA fragment is restricted with HindIII to remove the aadA
orf and the remaining DNA fragment, containing approximately 653 base pairs of the C.
reinhardtii atpA gene and approximately 437 base pairs of the C. reinhardtii rbcL gene
(Goldschmidt-Clermont M., 1991), is isolated by agarose gel electrophoresis and purified
by GeneClean. Plasmid pFH04 is restricted with NdeI, purified by GeneClean, and the
resulting 5' overhangs are filled in with Klenow and dNTPs. Following GeneClean
purification, the blunt-ended DNA fragment is restricted with HindIII and the resulting
DNA fragment, containing Operon F (see Fig. 13), is isolated by agarose gel
electrophoresis and purified by GeneClean. The blunt end-HindIII fragment is inserted
into the blunt end-HindIII sites of the DNA fragment isolated from pUC-atpX-AAD by
ligation, placing the orf encoding S. cerevisiae acetoacetyl-CoA thiolase, located at the
beginning of Operon F, in frame with the ATG start codon of the 5' atpA DNA in
pUC-atpX-AAD (Goldschmidt-Clermont M., 1991). The resulting modified yeast orf

only encodes 2 extra amino acids, Met and Ser, appended to the N-terminal Met of the
acetoacetyl-CoA thiolase polypeptide encoded by Operon F. The resulting
Chlamydomonas plastid transformation vector is designated pHK19. About 10,000 cells
are spread on TAP plates containing 200 micromolar fosmidomycin, plates are dried, and
then cells are immediately bombarded with M10 or 1 micron gold particles coated with
about 2 micrograms of plasmid DNA using the PDS-1000 He gene gun, 1100 psi, fourth
shelf from bottom, ~2 cm gap, -28 mgHg vacuum (alternatively cells are spread over a
Nytran nylon 0.45 micron membrane placed on top of TAP agar and bombarded without
a drying phase). Plates are incubated in low light for two to three weeks before colonies
are counted. Fosmidomycin-resistant colonies are green (vs yellowish for susceptible
cells) and transformants are characterized using skills as known in the art. This
demonstrates use of orfs encoding the entire mevalonate pathway as a selectable marker
for green algae and by virtue of its functioning demonstrates its utility for overproduction
of isoprenoid metabolites in microalgae.

Example 20: Metabolic engineering in transplastomic grain crops (rice) 

In another exemplified embodiment, an operon comprising orfs encoding the
entire mevalonate pathway is inserted into the plastids of rice as follows: A DNA
fragment isolated from pHK03, containing the complete mevalonate pathway, or from
pFH02, containing orfs encoding the entire mevalonate pathway and IPP isomerase, is
inserted into the NcoI-XbaI sites of plasmid pMSK49 to replace the gfp coding region
adjacent to the coding region for streptomycin resistance, aadA; or inserted into the
BstXI-NcoI digested DNA of plasmid pMSK48 using skills as is known in the art for
direct selection on fosmidomycin. The resulting plasmids contain rice-specific insertion
sequences of pMSK35 as described in Khan and Maliga, Nature Biotechnology 17:
910-914, 1999. Embryonic suspensions, induced as previously described (Khan and
Maliga 1999), of japonica rice Oryza sativa 'Taipei 309' engineered with the
beta-carotene pathway (Ye et al., Science 287:303-305) are plated onto filter paper and
bombarded with the PDS1000 He device as described in Example 17. After two days on
non-selective medium and then one to two weeks in selective AA medium (Toriyama and
Hinata, Plant Science 41:179-183, 1985) tissue is transferred to agar solidified medium
of MS salts and vitamins, 100 mg/L myo-inositol, 4 mg/L 6-benzylaminopurine, 0.5 mg/L
indoleacetic acid, 0.5 mg/L 1-naphthaleneacetic acid, 3% sucrose, 4% maltose and 100
mg/L streptomycin sulfate or 500 µM fosmidomycin. Transplastomic shoots appear
following cultivation in the light after three weeks and leaf samples are analyzed for the
operon by PCR.
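
The media recipes in these examples mix mass concentrations (mg/L) with molar concentrations (µM). The following Python sketch shows one way to put the growth regulators on a molar scale for comparison. The molecular formulas are supplied here from general chemical knowledge as assumptions and are not part of this description; the function names are hypothetical.

```python
# Standard atomic masses in g/mol (rounded).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Elemental compositions (assumed, not recited in the description).
BENZYLAMINOPURINE = {"C": 12, "H": 11, "N": 5}       # 6-benzylaminopurine
INDOLEACETIC_ACID = {"C": 10, "H": 9, "N": 1, "O": 2}  # indoleacetic acid


def molar_mass(composition: dict[str, int]) -> float:
    """Molar mass in g/mol from an element -> atom count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def mg_per_l_to_micromolar(mg_per_l: float, composition: dict[str, int]) -> float:
    """Convert mg/L to µM: (mg/L) / (g/mol) gives mmol/L, times 1000 gives µmol/L."""
    return mg_per_l / molar_mass(composition) * 1000.0


print(f"4 mg/L 6-benzylaminopurine = {mg_per_l_to_micromolar(4.0, BENZYLAMINOPURINE):.2f} µM")
print(f"0.5 mg/L indoleacetic acid = {mg_per_l_to_micromolar(0.5, INDOLEACETIC_ACID):.2f} µM")
```

On this scale the 500 µM fosmidomycin used for selection is well above the micromolar levels of the growth regulators in the same medium.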
Claims

1. A method of providing a cell with herbicide resistance comprising the steps of:

providing a polynucleotide comprising polynucleotide sequences encoding
the enzymes of the complete mevalonate pathway;

introducing said polynucleotide into a plurality of target cells;

contacting said cells with an herbicide that targets a component of a
non-mevalonate pathway; and

selecting at least one target cell which exhibits herbicide resistance.

2. The method according to claim 1, wherein said target cell is a plant cell. 

3. The method according to claim 1, wherein said target cell is a microalgae cell. 

4. The method according to claim 2, wherein said polynucleotide is introduced 
into a plastid of said target cell. 

5. The method according to claim 3, wherein said polynucleotide is introduced 
into a plastid of said target cell. 

6. The method according to claim 1, wherein said polynucleotide further 
comprises a sequence encoding IPP isomerase. 

7. The method according to claim 2, wherein said polynucleotide comprises a 
sequence encoding IPP isomerase. 

8. The method according to claim 3, wherein said polynucleotide comprises a
sequence encoding IPP isomerase.

9. The method according to claim 4, wherein said polynucleotide comprises a
sequence encoding IPP isomerase.

10. The method according to claim 5, wherein said polynucleotide comprises a
sequence encoding IPP isomerase.

11. An isolated polynucleotide encoding R. capsulatus IPP isomerase, said 
polynucleotide comprising the sequence of SEQ ID NO: 55. 

12. A method for producing a transformed plant comprising the steps of: 

providing a polynucleotide comprising polynucleotide sequences encoding 
the enzymes of the complete mevalonate pathway;

introducing said polynucleotide into a plurality of target plant cells; 
selecting at least one plant cell transformed with said polynucleotide; and 
regenerating said at least one plant cell into a transformed plant. 

13. A method according to claim 12, wherein said polynucleotide is introduced 
into a plastid of said target plant cell, and wherein said plant is a transplastomic plant. 

14. A plant produced by the method of claim 11.

15. A plant according to claim 14, wherein said plant is a transplastomic plant. 

16. A method for providing transformed cells having increased isoprenoid 
production as compared to non-transformed cells, comprising the steps of: 

providing an isolated polynucleotide comprising polynucleotide sequences 
encoding the enzymes of the complete mevalonate pathway; 
providing a plurality of target cells; 

introducing said isolated polynucleotide into said target cells; 
selecting target cells which have been transformed with said polynucleotide; and 
growing said transformed cells under conditions whereby additional generations 
of descendant transformed cells are produced, said transformed cells exhibiting increased 
isoprenoid production as compared to non-transformed cells of the same type. 

17. The method according to claim 16, wherein said isolated polynucleotide
further comprises the polynucleotide sequence encoding IPP isomerase.

18. The method of claim 16, wherein said target cells are microalgae. 

19. The method of claim 17, wherein said target cells are microalgae. 

20. The method of claim 16, further comprising the step of regenerating said 
transformed cells into a transformed plant, wherein said transformed plant exhibits 
increased isoprenoid production as compared to a non-transformed plant of the same type. 

21. A plant produced by the method of claim 20. 

22. Descendants of the plant of claim 21, wherein said descendants exhibit 
increased isoprenoid production as compared to non-transformed plants of the same type. 

23. A method of providing a cell with antibiotic resistance comprising the steps of:

providing a polynucleotide comprising polynucleotide sequences encoding
the enzymes of the complete mevalonate pathway;

introducing said polynucleotide into a plurality of target cells;

contacting said cells with an antibiotic that targets a component of a
non-mevalonate pathway; and

selecting at least one target cell which exhibits antibiotic resistance.

24. The method according to claim 23, wherein said target cell is a plant cell. 

25. The method according to claim 24, wherein said target cell is a microalgae cell.

26. The method according to claim 24, wherein said polynucleotide is introduced
into a plastid of said target cell.

27. The method according to claim 25, wherein said polynucleotide is introduced
into a plastid of said target cell.

28. The method according to claim 23, wherein said polynucleotide further 
comprises a sequence encoding IPP isomerase. 

29. The method according to claim 24, wherein said polynucleotide comprises 
a sequence encoding IPP isomerase. 

30. The method according to claim 25, wherein said polynucleotide comprises 
a sequence encoding IPP isomerase. 

31. The method according to claim 26, wherein said polynucleotide comprises
a sequence encoding IPP isomerase. 

32. The method according to claim 27, wherein said polynucleotide comprises 
a sequence encoding IPP isomerase. 

33. The method according to claim 1, wherein said polynucleotide comprises the 
polynucleotide sequence of SEQ ID NO: 58. 



34. The method according to claim 1, wherein said polynucleotide comprises the 
polynucleotide sequence of SEQ ID NO: 59. 

35. The method according to claim 1, wherein said polynucleotide comprises the 
polynucleotide sequence of SEQ ID NO: 60. 

36. The method according to claim 1, wherein said polynucleotide comprises the 
polynucleotide sequence of SEQ ID NO: 61.

37. The method according to claim 1, wherein said polynucleotide comprises the

38. The method according to claim 1, wherein said polynucleotide comprises the
polynucleotide sequence of SEQ ID NO: 63.

39. The method according to claim 1, wherein said polynucleotide comprises the 
polynucleotide sequence of SEQ ID NO: 64. 

40. The method according to claim 1, wherein said polynucleotide comprises the 
polynucleotide sequence of SEQ ID NO: 72. 

41. The method according to claim 1, wherein said polynucleotide comprises the 
polynucleotide sequence of SEQ ID NO: 73. 

42. The method according to claim 1, wherein said polynucleotide comprises the 
polynucleotide sequence of SEQ ID NO: 74. 

43. The method according to claim 1, wherein said polynucleotide comprises the
polynucleotide sequence of SEQ ID NO: 76. 

44. The method according to claim 12, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 58. 

45. The method according to claim 12, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 59. 

46. The method according to claim 12, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 60. 

47. The method according to claim 12, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 61.

48. The method according to claim 12, wherein said polynucleotide comprises
the polynucleotide sequence of SEQ ID NO: 62. 

49. The method according to claim 12, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 63. 

50. The method according to claim 12, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 64. 

51. The method according to claim 12, wherein said polynucleotide comprises
the polynucleotide sequence of SEQ ID NO: 72. 

52. The method according to claim 12, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 73.

53. The method according to claim 12, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 74. 

54. The method according to claim 12, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 76. 

55. The method according to claim 16, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 58.

56. The method according to claim 16, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 59. 

57. The method according to claim 16, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 60. 

58. The method according to claim 16, wherein said polynucleotide comprises
the polynucleotide sequence of SEQ ID NO: 61.

59. The method according to claim 16, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 62. 

60. The method according to claim 16, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 63. 



61. The method according to claim 16, wherein said polynucleotide comprises
the polynucleotide sequence of SEQ ID NO: 64. 

62. The method according to claim 16, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 72. 

63. The method according to claim 16, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 73. 

64. The method according to claim 16, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 74. 

65. The method according to claim 16, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 76. 

66. The method according to claim 23, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 58. 

67. The method according to claim 23, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 59. 

68. The method according to claim 23, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 60.

69. The method according to claim 23, wherein said polynucleotide comprises
the polynucleotide sequence of SEQ ID NO: 61. 



70. The method according to claim 23, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 62. 

71. The method according to claim 23, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 63.

72. The method according to claim 23, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 64. 

73. The method according to claim 23, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 72. 

74. The method according to claim 23, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 73. 

75. The method according to claim 23, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 74. 

76. The method according to claim 23, wherein said polynucleotide comprises 
the polynucleotide sequence of SEQ ID NO: 76. 

77. An isolated polynucleotide comprising polynucleotide sequences encoding 
the enzymes of the complete mevalonate pathway, said polynucleotides comprising a 
sequence selected from the group consisting of SEQ ID NO: 58, SEQ ID NO: 59, SEQ 
ID NO: 60, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID 
NO: 72, SEQ ID NO: 73, SEQ ID NO: 74, and SEQ ID NO: 76. 




78. An isolated polynucleotide according to claim 77, wherein said 
polynucleotide comprises the sequence of SEQ ID NO: 58. 

79. An isolated polynucleotide according to claim 77, wherein said 
polynucleotide comprises the sequence of SEQ ID NO: 59. 

80. An isolated polynucleotide according to claim 77, wherein said 
polynucleotide comprises the sequence of SEQ ID NO: 60. 

81. An isolated polynucleotide according to claim 77, wherein said 
polynucleotide comprises the sequence of SEQ ID NO: 61.

82. An isolated polynucleotide according to claim 77, wherein said 
polynucleotide comprises the sequence of SEQ ID NO: 62. 

83. An isolated polynucleotide according to claim 77, wherein said 
polynucleotide comprises the sequence of SEQ ID NO: 63. 

84. An isolated polynucleotide according to claim 77, wherein said 
polynucleotide comprises the sequence of SEQ ID NO: 64. 

85. An isolated polynucleotide according to claim 77, wherein said 
polynucleotide comprises the sequence of SEQ ID NO: 72. 

86. An isolated polynucleotide according to claim 77, wherein said 
polynucleotide comprises the sequence of SEQ ID NO: 73. 

87. An isolated polynucleotide according to claim 77, wherein said 
polynucleotide comprises the sequence of SEQ ID NO: 74. 

88. An isolated polynucleotide according to claim 77, wherein said
polynucleotide comprises the sequence of SEQ ID NO: 76.

89. An isolated polynucleotide comprising the sequence of SEQ ID NO: 75. 

90. A method of providing a cell with an inserted polynucleotide sequence 
encoding one or more products of interest comprising the steps of: 

providing a plurality of target cells having an identified pseudogene site therein; 

providing an isolated polynucleotide comprising polynucleotide sequences of said 
pseudogene site flanking at least one coding sequence of interest; 

introducing said polynucleotide into a plurality of said target cells; 

selecting at least one target cell which contains the coding sequence of interest 
inserted into said pseudogene site.

91. The method according to claim 90, wherein said pseudogene is a defunct 
gene located in an active operon from which monocistronic or polycistronic RNA is 
produced. 

92. The method according to claim 91, wherein said operon is the rpl23 operon.

93. The method according to claim 91, wherein said pseudogene is infA. 

94. The method according to claim 90, wherein the inserted polynucleotide is 
operably linked to the regulatory sequences of the pseudogene. 

95. The method according to claim 94, wherein the inserted polynucleotide is 
operably linked to the regulatory sequences of the rpl23 operon.

96. The method according to claim 94, wherein the inserted polynucleotide is 
operably linked to the regulatory sequences of infA.

97. The method according to claim 90, wherein the isolated polynucleotide
further comprises additional flanking sequences that themselves flank the pseudogene 
sequences, and wherein said additional flanking sequences, in their native state, flank the 
pseudogene in its native state. 

98. The method according to claim 97, wherein the inserted polynucleotide 
replaces the pseudogene in its entirety. 

99. The method according to claim 97, wherein said additional flanking sequences 
are native plastid sequences. 

100. The method according to claim 90, wherein said target cell is a plant cell. 

101. The method according to claim 90, wherein said target cell is a microalgae cell.

102. The method according to claim 100, wherein said polynucleotide is 
introduced into a plastid of said target cell. 

103. The method according to claim 101, wherein said polynucleotide is 
introduced into a plastid of said target cell. 

104. The method according to claim 100, wherein said plant cell is selected from 
the group consisting of the rosids, asterids, and liliales. 

105. The method according to claim 100, wherein said plant cell is from a 
solanaceous species. 

106. The method according to claim 105, wherein said plant cell is selected from
the group consisting of petunia, tomato, potato, and tobacco cells.

107. The method according to claim 90, wherein said coding sequence of interest
comprises polynucleotide sequences encoding the enzymes of the complete mevalonate 
pathway. 

108. The method according to claim 90, wherein said polynucleotide further 
comprises a sequence encoding IPP isomerase. 

109. The method according to claim 107, wherein said polynucleotide further 
comprises a sequence encoding IPP isomerase. 

110. The method according to claim 90, wherein said polynucleotide comprises 
polynucleotide sequences encoding phytoene synthase. 

111. The method according to claim 90, wherein said polynucleotide is 
promoterless. 

112. A method according to any of claims 100, 102, 104, 105, and 106, said 
method further comprising the step of regenerating said selected target cell into a plant, 
said plant comprising said coding sequence of interest. 

113. A plant produced by the method of claim 112.

114. Descendants of the plant of claim 113, said descendant plants comprising
said coding sequence of interest. 

SEQUENCE LISTING 

<110> Hahn, Frederick 

Kuehnle, Adelheid 



<120> Manipulation of genes of the mevalonate and isoprenoid pathways to 
create novel traits in transgenic organisms 



<130> KAS-103XC1 

<150> 60/221,703 

<151> 2000-07-31 

<160> 76 

<170> PatentIn version 3.0

<210> 1 

<211> 57 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer containing Saccharomyces cerevisiae DNA

<400> 1 

ggactagtct gcaggaggag ttttaatgtc attaccgttc ttaacttctg caccggg 57 

<210> 2 

<211> 96 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer containing S. cerevisiae DNA 

<400> 2 
ttctcgagct taagagtagc aatatttacc ggagcagtta cactagcagt atatacagtc 60

attaaaactc ctcctgtgaa gtccatggta aattcg 96 

<210> 3 

<211> 56 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer containing S. cerevisiae DNA 

<400> 3 

tagcggccgc aggaggagtt catatgtcag agttgagagc cttcagtgcc ccaggg 56 

<210> 4 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer containing S. cerevisiae DNA 
<400> 4 

tttctgcagt ttatcaagat aagtttccgg atcttt 36 

<210> 5 

<211> 41 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer containing S. cerevisiae DNA 

<400> 5 

ggaattcatg accgtttaca cagcatccgt taccgcaccc g 41 

<210> 6 

<211> 45 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> PCR primer containing S. cerevisiae DNA 
<400> 6 

ggctcgagtt aaaactcctc ttcctttggt agaccagtct ttgcg 45 

<210> 7 
<211> 68 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer containing Arabidopsis thaliana DNA 
<400> 7 

gctctagatg cgcaggaggc acatatggcg aagaacgttg ggattttggc tatggatatc 60 
tatttccc 68

<210> 8 

<211> 61 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer containing A. thaliana DNA



<400> 8 

cgctcgagtc gacggatcct cagtgtccat tggctacaga tccatcttca cctttcttgc 60 



61 



<210> 9 

<211> 72

<212> DNA

<213> Artificial Sequence
<220> 

<223> PCR primer containing A. thaliana DNA 

<400> 9 
<223> Oligonucleotide containing S. cerevisiae DNA
<400> 16 

cgctcgagcc cgggggatcc ttagcaacga tgaattaagg tatcttggaa ttttgacgc 59 
<210> 17 
<211> 6215 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> misc_feature 
<222> ()..() 

<223> Vector pBSNT27 containing Nicotiana tabacum DNA 



<400> 17 
gcacttttcg gggaaatgtg cgcggaaccc ctatttgttt atttttctaa atacattcaa 60
atatgtatcc gctcatgaga caataaccct gataaatgct tcaataatat tgaaaaagga 120
agagtatgag tattcaacat ttccgtgtcg cccttattcc cttttttgcg gcattttgcc 180
ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg 240
gtgcacgagt gggttacatc gaactggatc tcaacagcgg taagatcctt gagagttttc 300
gccccgaaga acgttttcca atgatgagca cttttaaagt tctgctatgt ggcgcggtat 360
tatcccgtat tgacgccggg caagagcaac tcggtcgccg catacactat tctcagaatg 420
acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag 480
aattatgcag tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa 540
cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc 600
gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag cgtgacacca 660
cgatgcctgt agcaatggca acaacgttgc gcaaactatt aactggcgaa ctacttactc 720
tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc 780
tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg 840
ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta 900
tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag 960
gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga 1020
ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc 1080
tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac cccgtagaaa 1140
agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa 1200
aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc 1260
cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt 1320
agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc 1380
tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac 1440
gatagttacc ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc acacagccca 1500
gcttggagcg aacgacctac accgaactga gatacctaca gcgtgagcta tgagaaagcg 1560
ccacgcttcc cgaagggaga aaggcggaca ggtatccggt aagcggcagg gtcggaacag 1620
gagagcgcac gagggagctt ccagggggaa acgcctggta tctttatagt cctgtcgggt 1680
ttcgccacct ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg cggagcctat 1740
ggaaaaacgc cagcaacgcg gcctttttac ggttcctggc cttttgctgg ccttttgctc 1800
acatgttctt tcctgcgtta tcccctgatt ctgtggataa ccgtattacc gcctttgagt 1860
gagctgatac cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg agcgaggaag 1920
cggaagagcg cccaatacgc aaaccgcctc tccccgcgcg ttggccgatt cattaatgca 1980
gctggcacga caggtttccc gactggaaag cgggcagtga gcgcaacgca attaatgtga 2040
gttagctcac tcattaggca ccccaggctt tacactttat gcttccggct cgtatgttgt 2100
gtggaattgt gagcggataa caatttcaca caggaaacag ctatgaccat gattacgcca 2160
agctcgaaat taaccctcac taaagggaac aaaagctgga gctccaccgc ggtggcggcc 2220
gctctagaac tagtggatct tcttggctgt tattcaaaag gtccaacaat gtatatatat 2280
tggacatttt gaggcaatta tagatcctgg aaggcaattc tgattggtca ataaaaatcg 2340
atttcaatgc tatttttttt ttgtttttta tgagtttagc caatttatca tgaaaggtaa 2400
aaggggataa aggaaccgtg tgttgattgt cctgtaaata taagttgtct tcctccatat 2460
gtaaaaaggg aataaataaa tcaattaaat ttcgggatgc ttcatgaagt gcttctttcg 2520
gagttaaact tccgtttgtc catatttcga gaaaaagtat ctcttgtttt tcattcccat 2580
tcccataaga atgaatacta tgattcgcgt ttcgaacagg catgaataca gcatctatag 2640
gataacttcc atcttgaaag ttatgtggcg tttttataag atatccacga tttctctcta 2700
tttgtaatcc aatacaaaaa tcaattggtt ccgttaaact ggctatatgt tgtgtattat 2760
caacgatttc tacataaggc ggcaagatga tatcttgggc agttacagat ccaggaccct 2820
tgacacaaat agatgcgtca gaagttccat atagattact tcttaatata atttctttca 2880
aattcattaa aatttcatgt accgattctt gaatgcccgt tatggtagaa tattcatgtg 2940
ggactttctc


agattttaca 


cgi.gLgaT.ac 




4-r»4--}-4-r'+-/-'/— ^ 
L. Ct L L LULL-Ua 




JUUU 


ttcgcatcgc 


aatgcctatt 


gugucggct l 


ggccx. utCdL 


aa(j Lyyayao 


dyad LaaayL 


JUUU 


gtccataata 


aaggcgttta 


ctgtctgttc 


rLgamcaac 


aCaCLUCCaU 


4- rr4- a rr"h <nr"H r»r^ 




gagtagatac 


tgttact ttc 


tctcgaacca 


tiag Laccax. c 


dCttyaLLay 


-3 4- /-> -3 4- /— > /-r rj -5 4- 


^ t on 


cttttatttc 


tcttgagatt 


4-_4-4- n -^.-.4- A ,4- 

tctxcaatgt 


tcagttctac 


acacgt cttt 


4~ 4- 4" 4— /~* /~Y —5 /-^* r-y 

Ltttcgyagy 




tctacagcca 


ttatgtggca 


taggagttac 


atcccgtacg 


aaagttaata 


gna izaccaci- 


jjUU 


tcgacgaata 


gctcgtaatg 


ctgcatctct 


tccgagaccg 


ggaccLuuT-a 


LCdUgdLLLC 




tgctcgttgc 


ataccttgat 


ccactactgt 


acggatagcg 


tttgctgctg 


cggu utgagc 


^4 9n 


agcaaacggt 


gttcctcttc 


tcgtaccttt 


gaatccagaa 


gtaccggcgg 


aggaccaaga 




aactactcga 


ccccgtacat 


ctgtaacagt 


gacaatggta 


4— 4~ 4- 4- *n a/"* 


4- 4~ rT 4" /~r zi 

LLgcULyadC 




atgaataact 


ccctttggta 


ttctacgtgc 


acccttacgt 


gaaccaat ac 


ytCCdLtCCL 


JuUU 


acgcgaacta 


attttcggta 


tagcttttgc 


catattttat 


catctcgtaa 


atatgagtca 




gagatatatg 


gatatatcca 


tttcatgtca 


aaacagattc 


4-4-4- 4- 4- 4- /-»+- -i 

tttatttgta 


catcggctct 


O / U 


tctggcaagt 


ctgattatcc 


ctgtctttgt 


ttatgtctcg 


ggttggaaca 


aattactata 


J/OU 


attcgtcccc 


gcctacggat 


tagtcgacat 


ttttcacaaa 


ttttacgaac 


ggaaguLCuL 


O O fi U 


attttcatat 


ttctcattcc 


ttaccttaat 


tctgaatcta 


tttcttggaa 


gaaaataagt 


uu 


ttcttgaaat 


ttttcatctc 


gaattgtatt 


cccacgaaag 


gaatggtgaa 


gttgaaaaac 




gaatccttca 


aatctttgtt 


gtggagtcga 


taaattatac 


gccctttggt 


ugaa l ca raa 


4 09 n 


ggacttactt 


caattttgac 


tctatctcct 


ggcagtatcc 


gcaLaaaacu 


dLyecyydtc 


4DP0 


tttcctgaaa 


cataatttat 


aatcagatct 


aaacaaaccc 


ggaacagacc 


gt cy y gdag c 




gattcagtaa 


ttaaagcttc 


atgactcctt 


4-4-4- _-~4- 4- 4- 

tttggutcti: 


aaag tcccu u 


i_yd.yy Lauuci 


4 200 


actaataaga 


aagatattag 


acaacccccc 


4.+. 4-4-4- 4- r-t+-4- +■ 


4— 4- i— * /-» *^ 4" a 


y y ctcty l. l. L.v^y 


4260 


aatccaattt 


ggatattaaa 


aggattacca 


gatataacac 


aaaatctctc 


f-% —1 ^1 /-^ 4— — \ t* 4— /—1 

cacctaLucc 


4^90 


ttctagtcga 


gcctctcggt 


ctgtzcattat: 


acctcgagaa 


gtagaaagaa 


LLdOda LLOO 


4 0 

y JO u 


cattccacct 


aaaattcgcg 


gaattcgttg 


ataattagaa 


tagautcgta 


gaccaggtcg 


4 4 4 0 


actgattcgt 


tttaaattta 


aaatatttct 


_ x_ _ _4_ —,4- 4_ 

atagggtctt 


ttcctattcc 


4- 4- /-i 4- o 4- /t4- /— < /-r 

llcl atytcg 


4^00 


cagggttaaa 


accaaaaaat 


atttgttttt 


4- 4- — 4- _ — 4- 

ttctcgatgt 


4- 4_ 4_ _4- _ _ _i- 

tttctcacgt: 


tttcgauaaa 


4 RfiO 


accttctcgt 


aaaagtattt 


gaacaatatt 


ttcggtaata 


ttagtagatg 


ctattcgaac 


4620 


cacccttttt 


cgatccatat 


cagcatttcg 


tatagaagtt 


attatctcag 


caatagtgtc 


4680 


cctacccatg 


atgaactaaa 


attat'tgggg 


cctccaaatt 


tgatataatc 


aacgtgtttt 


4740 


ttacttattt 


tttttttgaa 


tatgatatga 


attattaaag 


atatatgcgt 


gagacacaat 


4 800 


ctactaatta 


atctatttct 


ttcaaatacc 


ccactagaaa 


cagatcacaa 


tttcatttta 


4860 
taatarrt ret


rrrr^i rrrl" aof rr 
yyayutaauy 


addC L a L L L L 


agtaaaattt aattctctca attcccgggc 


a c\ o r\ 

4 920 


Cfattcrcacca 


uGoa l. tv^y ay 


ft- ppi-hl--*- n n 

LLLLULLiyd 


tttccttcct tcttgatcaa 


taacaactgc 


4 980 


agcattgtca 


t catatcota 


L. d l_ V_» d LULL 


gttgtcacgt ttgagttctt tacaggtccg 


DU4U 


cacaal't's pa 


rrr^"h r ,- h rr a p"H a 
you^L.ydi^i_d 




ttctaggggc atatttggta 


eggcttcttt 


5100 


GatpapAfipa 


d odd L. ddty L 


r** /-** ^ n 4- -i 4- >■» 

CdCCaaLdUy 


agcatatcga egattgetag 


ctcctatgat 


5160 


v^>y ci a LOUQL 


Ct LLdd L LL-LL 


f~X ""3 /^f *^"t /-^i » 4"" 

yayCOCCy cu 


gttatccget acatttaaat 


gggtctgagg 


5220 


+■ +• era a 4- paf4- 


+-4-4-4-4- -j = 4-/-,i->, 
LLLLLaaLCC 


i*t4- 4-> ^ 4— 4- 4» — \ 

yutctLtgaa 


tgeaaaggge gaagaaaaaa 


aagaaatatt 


5280 




aaaaaagaaa 


catgcggttt 


cgtttcatat ctaagagccc tttcegcatt 


5340 




acau uacgaa 


ataatgaatt 


gagttegtat aggcatttta 


gatgetgeta 


5400 


y Ly add LcllJO 


/"» /~t 4- 4- ^»4" /"f rx 4- 

ccLLCLyycu 


•\ 4- 4* ^ 4* 4— ^* 4- *-*t 

aiattLtctg 


ttactccacc catttcataa 


agtattcgac 


5460 


/"~TTY 4~ 4- 4- —j — i /-< 


aacagctacc 


caat att cag 


gggatccccc gggctgeagg 


aattcgatat 


5520 


P3 ;3 f +- +■ -3 -f- 
taaytL. L. a LL> 


yaLdCCy Lug 


acctcgaggg 


ggggcccggt acccaattcg 


ccctatagtg 


5580 


ay LLy LaLta 


^» q ^ 4* 4" 4* f+ 
CaaLLCaCCg 


gccgtcgttt 


tacaaegteg tgactgggaa 


aaccctggcg 


5640 


tt"APPP3APt" 


LddLOyuLL U 


gccigccica uc 


cccctttcgc cagctggcgt 


aatagcgaag 


5700 




Oyd LuyOuUL 


f% ^ ^ ^ /^r 4" 

LCCCaaCayl. 


tgcgcagcct gaatggcgaa 


tgggacgege 


5760 




/— » f»f /— 1 — J 4- 4- n/yrt 


gcggcgggtg 


tggtggttac. gcgcagcgtg 


accgctacac 


5820 


t* +" ctcT 1 ?* cirri r* 


LLLdyuyc CC 


gcL.cct.tTi eg 


ctttcttccc ttcctttctc 


gccacgttcg 


5880 


/~» /—i rrrv /^4~ 4- 4- /~*t~* 

uucjycLttcc 


ccgt caagct 


etaaateggg 


ggctcccttt agggttccga tttagtgctt 


5940 


tacggcacct cgaccccaaa aaacttgatt agggtgatgg ttcacgtagt gggccatcgc 6000
cctgatagac ggtttttcgc cctttgacgt tggagtccac gttctttaat agtggactct 6060
tgttccaaac tggaacaaca ctcaacccta tctcggtcta ttcttttgat ttataaggga 6120
ttttgccgat ttcggcctat tggttaaaaa atgagctgat ttaacaaaaa tttaacgcga 6180
attttaacaa aatattaacg cttacaattt aggtg 6215



WO 02/10398 PCT/US01/24037

<210> 18
<211> 1332
<212> DNA
<213> Artificial Sequence
<220>
<223> Oligonucleotide containing N. tabacum and S. cerevisiae DNA
<400> 18

atgtcattac cgttcttaac ttctgcaccg ggaaaggtta ttatttttgg tgaacactct 60
gctgtgtaca acaagcctgc cgtcgctgct agtgtgtctg cgttgagaac ctacctgcta 120
ataagcgagt catctgcacc agatactatt gaattggact tcccggacat tagctttaat 180
cataagtggt ccatcaatga tttcaatgcc atcaccgagg aucaagcaaa cucccaaaaa 240
ttggccaagg ctcaacaagc caccgatggc ttgtctcagg aactcgttag tcttttggafc 300
ccgttgttag ctcaactatc cgaatcctuc cactaccang cagcguL-T-L-g LtLccigtaL 360
atgtttgttt gcctatgccc ccatgccaag aatattaagt tttctttaaa gtctacttta 420
cccatcggtg ctgggttggg ctcaagcgcc tctatttctg tatcactggc cttagctatg 480
gcctacttgg gggggttaat aggatctaat gacttggaaa agctgtcaga aaacgataag 540
catatagtga atcaatgggc cttcataggt gaaaagtgta ttcacggtac cccttcagga 600
atagataacg ctgtggccac ttatggtaat gccctgctat ttgaaaaaga ctcacataat 660
ggaacaataa acacaaacaa ttttaagttc ttagatgatt tcccagccat tccaatgatc 720
ctaacctata ctagaattcc aaggtctaca aaagatcttg ttgctcgcgt tcgtgtgttg 780
gtcaccgaga aatttcctga agttatgaag ccaattctag atgccatggg tgaatgtgcc 840
ctacaaggct tagagatcat gactaagtta agtaaatgta aaggcaccga tgacgaggct 900
gtagaaacta ataatgaact gtatgaacaa ctattggaat tgataagaat aaatcatgga 960
ctgcttgtct caatcggtgt ttctcatcct ggattagaac ttattaaaaa tctgagcgat 1020
gatttgagaa ttggctccac aaaacttacc ggtgctggtg gcggcggttg ctctttgact 1080
ttgttacgaa gagacattac tcaagagcaa attgacagct tcaaaaagaa attgcaagat 1140
gattttagtt acgagacatt tgaaacagac ttgggtggga ctggctgctg tttgttaagc 1200
gcaaaaaatt tgaataaaga tcttaaaatc aaatccctag tattccaatt atttgaaaat 1260
aaaactacca caaagcaaca aattgacgat ctattattgc caggaaacac gaatttacca 1320
tggacttcat aa 1332


<210> 19 
<211> 1191 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing N. tabacum and A. thaliana DNA 


<400> 19 

atgaccgttt acacagcatc cgttaccgca cccgtcaaca tcgcaaccct taagtattgg 60 
gggaaaaggg acacgaagtt gaatctgccc accaattcgt ccatatcagt gactttatcg 120
caagatgacc tcagaacgtt gacctctgcg gctactgcac ctgagtttga acgcgacact 180
ttgtggttaa atggagaacc acacagcatc gacaatgaaa gaactcaaaa ttgtctgcgc 240
gacctacgcc aattaagaaa ggaaatggaa tcgaaggacg cctcattgcc cacattatct 300
caatggaaac tccacattgt ctccgaaaat aactttccta cagcagctgg tttagcttcc 360
tccgctgctg gctttgctgc attggtctct gcaattgcta agttatacca attaccacag 420
tcaacttcag aaatatctag aatagcaaga aaggggtctg gttcagcttg tagatcgttg 480
tttggcggat acgtggcctg ggaaatggga aaagctgaag atggtcatga ttccatggca 540
gtacaaatcg cagacagctc tgactggcct cagatgaaag cttgtgtcct agttgtcagc 600
gatattaaaa aggatgtgag ttccactcag ggtatgcaat tgaccgtggc aacctccgaa 660
ctatttaaag aaagaattga acatgtcgta ccaaagagat ttgaagtcat gcgtaaagcc 720
attgttgaaa aagatttcgc cacctttgca aaggaaacaa tgatggattc caactctttc 780
catgccacat gtttggactc tttccctcca atattctaca tgaatgacac ttccaagcgt 840
atcatcagtt ggtgccacac cattaatcag ttttacggag aaacaatcgt tgcatacacg 900
tttgatgcag gtccaaatgc tgtgttgtac tacttagctg aaaatgagtc gaaactcttt 960
gcatttatct ataaattgtt tggctctgtt cctggatggg acaagaaatt tactactgag 1020
cagcttgagg ctttcaacca tcaatttgaa tcatctaact ttactgcacg tgaattggat 1080
cttgagttgc aaaaggatgt tgccagagtg attttaactc aagtcggttc aggcccacaa 1140
gaaacaaacg aatctttgat tgacgcaaag actggtctac caaaggaata a 1191



<210> 20 
<211> 1197 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> PCR primer containing Rhodobacter capsulatus DNA 
<400> 20 

atgtctcaga acgtttacat tgtatcgact gccagaaccc caattggttc attccagggt 60 
tctctatcct ccaagacagc agtggaattg ggtgctgttg ctttaaaagg cgccttggct 120 
aaggttccag aattggatgc atccaaggat tttgacgaaa ttatttttgg taacgttctt 180 
tctgccaatt tgggccaagc tccggccaga caagttgctt tggctgccgg tttgagtaat 240 
catatcgttg caagcacagt taacaaggtc tgtgcatccg ctatgaaggc aatcattttg 300 
ggtgctcaat ccatcaaatg tggtaatgct gatgttgtcg tagctggtgg ttgtgaatct 360 
atgactaacg caccatacta catgccagca gcccgtgcgg gtgccaaatt tggccaaact 420
gttcttgttg atggtgtcga aagagatggg ttgaacgatg cgtacgatgg tctagccatg 480
ggtgtacacg cagaaaagtg tgcccgtgat tgggatatta ctagagaaca acaagacaat 540
tttgccatcg aatcctacca aaaatctcaa aaatctcaaa aggaaggtaa attcgacaat 600
gaaattgtac ctgttaccat taagggattt agaggtaagc ctgatactca agtcacgaag 660
gacgaggaac ctgctagatt acacgttgaa aaattgagat ctgcaaggac tgttttccaa 720
aaagaaaacg gtactgttac tgccgctaac gcttctccaa tcaacgatgg tgctgcagcc 780
gtcatcttgg tttccgaaaa agttttgaag gaaaagaatt tgaagccttt ggctattatc 840
aaaggttggg gtgaggccgc tcatcaacca gctgatttta catgggctcc atctcttgca 900
gttccaaagg ctttgaaaca tgctggcatc gaagacatca L LaUL L Lyaa
ttcaatgaag ccttttcggt tgtcggtttg gtgaacacta agattttgaa gctagaccca 1020
tctaaggtta atgtatatgg tggtgctgtt gctctaggtc acccattggg ttgttctggt 1080
gctagagtgg ttgttacact gctatccatc ttacagcaag aaggaggtaa gatcggtgtt 1140
gccgccattt gtaatggtgg tggtggtgct tcctctattg tcattgaaaa gatatga 1197


<210> 21
<211> 1386
<212> DNA
<213> Artificial Sequence
<220>
<223> PCR primer containing R. capsulatus DNA
<400> 21

atggcgaaga acgttgggat tttggctatg gatatctatt tccctcccac ctgtgttcaa 60
caggaagctt tggaagcaca tgatggagca agtaaaggga aatacactat tggacttggc 120
caagattgtt tagctttttg cactgagctt gaagatgtta tctctatgag tttcaatgcg 180
gtgacatcac tttttgagaa gtataagatt gaccctaacc aaatcgggcg tcttgaagta 240
ggaagtgaga ctgttattga caaaagcaag tccatcaaga ccttcttgat gcagctcttt 300
gagaaatgtg gaaacactga tgtcgaaggt gttgactcga ccaatgcttg ctatggtgga 360
actgcagctt tgttaaactg tgtcaattgg gttgagagta actcttggga tggacgttat 420
ggcctcgtca tttgtactga cagcgcggtt tatgcagaag gacccgcaag gcccactgga 480
ggagctgcag cgattgctat gttgatagga cctgatgctc ctatcgtttt cgaaagcaaa 540
ttgagagcaa gccacatggc tcatgtctat gacttttaca agcccaatct tgctagcgag 600
tacccggttg ttgatggtaa gctttcacag acttgctacc tcatggctct tgactcctgc 660

tataaacatt tatgcaacaa gttcgagaag atcgagggca aagagttctc cataaatgat 720 

gctgattaca ttgttttcca ttctccatac aataaacttg tacagaaaag ctttgctcgt 780 

ctcttgtaca acgacttctt gagaaacgca agctccattg acgaggctgc caaagaaaag 840 

ttcacccctt attcatcttt gacccttgac gagagttacc aaagccgtga tcttgaaaag 900 

gtgtcacaac aaatttcgaa accgttttat gatgctaaag tgcaaccaac gactttaata 960 

ccaaaggaag tcggtaacat gtacactgct tctctctacg ctgcatttgc ttccctcatc 1020 

cacaataaac acaatgattt ggcgggaaag cgggtggtta tgttctctta tggaagtggc 1080 

tccaccgcaa caatgttctc attacgcctc aacgacaata agcctccttt cagcatttca 1140 

aacattgcat ctgtaatgga tgttggcggt aaattgaaag ctagacatga gtatgcacct 1200 

gagaagtttg tggagacaat gaagctaatg gaacataggt atggagcaaa ggactttgtg 1260 

acaaccaagg agggtattat agatcttttg gcaccgggaa cttattatct gaaagaggtt 1320 

gattccttgt accggagatt ctatggcaag aaaggtgaag atggatctgt agccaatgga 1380 

cactga 1386 

<210> 22 
<211> 1779 
<212> DNA 

<213> Artificial Sequence 

<220>

<223> PCR primer containing Schizosaccharomyces pombe DNA 
<400> 22 

atggatctcc gtcggaggcc tcctaaacca ccggttacca acaacaacaa ctccaacgga 60 

tctttccgtt cttatcagcc tcgcacttcc gatgacgatc atcgtcgccg ggctacaaca 120

attgctcctc caccgaaagc atccgacgcg cttcctcttc cgttatatct cacaaacgcc 180 

gttttcttca cgctcttctt ctccgtcgcg tattacctcc tccaccggtg gcgtgacaag 240 

atccgttaca atacgcctct tcacgtcgtc actatcacag aactcggcgc cattattgct 300 

ctcatcgctt cgtttatcta tctcctaggg ttttttggta ttgactttgt tcagtcattt 360 

atctcacgtg cctctggtga tgcttgggat ctcgccgata cgatcgatga tgatgaccac 420 

cgccttgtca cgtgctctcc accgactccg atcgtttccg ttgctaaatt acctaatccg 480 

gaacctattg ttaccgaatc gcttcctgag gaagacgagg agattgtgaa atcggttatc 540 

gacggagtta ttccatcgta ctcgcttgaa tctcgtctcg gtgattgcaa aagagcggcg 600 
tcgattcgtc gtgaggcgtt gcagagagtc accgggagat cgattgaagg gttaccgttg 660

gatggatttg attatgaatc gattttgggg caatgctgtg agatgcctgt tggatacatt 720 

cagattcctg ttgggattgc tggtccattg ttgcttgatg gttatgagta ctctgttcct 780 

atggctacaa ccgaaggttg tttggftgct agcactaaca gaggctgcaa ggctatgttt 840

atctctggtg gcgccaccag taccgttctt aaggacggta tgacccgagc acctgttgtt 900 

cggttcgctt cggcgagacg agcttcggag cttaagtttt tcttggagaa tccagagaac 960 

tttgatactt tggcagtagt cttcaacagg tcgagtagat ttgcaagact gcaaagtgtt 1020 

aaatgcacaa tcgcggggaa gaatgcttat gtaaggttct gttgtagtac tggtgatgct 1080 

atggggatga atatggtttc taaaggtgtg cagaatgttc ttgagtatct taccgatgat 1140

ttccctgaca tggatgtgat tggaatctct ggtaacttct gttcggacaa gaaacctgct 1200 

gctgtgaact ggattgaggg acgtggtaaa tcagttgttt gcgaggctgt aatcagagga 1260 

gagatcgtga acaaggtctt gaaaacgagc gtggctgctt tagtcgagct caacatgctc 1320 

aagaacctag ctggctctgc tgttgcaggc tctctaggtg gattcaacgc tcatgccagt 1380 

aacatagtgt ctgctgtatt catagctact ggccaagatc cagctcaaaa cgtggagagt 1440

tctcaatgca tcaccatgat ggaagctatt aatgacggca aagatatcca tatctcagtc 1500 

actatgccat ctatcgaggt ggggacagtg ggaggaggaa cacagcttgc atctcaatca 1560 

gcgtgtttaa acctgctcgg agttaaagga gcaagcacag agtcgccggg aatgaacgca 1620 

aggaggctag cgacgatcgt agccggagca gttttagctg gagagttatc tttaatgtca 1680 

gcaattgcag ctggacagct tgtgagaagt cacatgaaat acaatagatc cagccgagac 1740

atctctggag caacgacaac gacaacaaca acaacatga 1779

<210> 23 
<211> 684 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> PCR primer containing S. pombe DNA 
<400> 23 

atgagttccc aacaagagaa aaaggattat gatgaagaac aattaaggtt gatggaagaa 60 

gtttgtatcg ttgtagatga aaatgatgtc cctttaagat atggaacgaa aaaggagtgt 120 

catttgatgg aaaatataaa taaaggtctt ttgcatagag cattctctat gttcatcttt 180 

gatgagcaaa atcgcctttt acttcagcag cgtgcagaag agaaaattac atttccatcc 240
ttatggacga atacatgttg ctcccaccca ttggatgttg ctggtgaacg tggtaatact 300
ttacctgaag ctgttgaagg tgttaagaat gcagctcaac gcaagctgtt ccatgaattg 360
ggtattcaag ccaagtatat tcccaaagac aaatttcagt ttcttacacg aatccattac 420
cttgctccta gtactggtgc ttggggagag catgaaattg actacattct tttcttcaaa 480
ggtaaagttg agctggatat caatcccaat gaagttcaag cctataagta tgttactatg 540
gaagagttaa aagagatgtt ttccgatcct caatatggat tcacaccatg gttcaaactt 600
atttgtgagc attttatgtt taaatggtgg caggatgtag atcatgcgtc aaaattccaa 660
gataccttaa ttcatcgttg ctaa 684



<210> 24 

<211> 531 

<212> DNA 

<213> Artificial Sequence 
<220> 



<223> PCR primer containing Streptomyces sp CL190 DNA
<400> 24

atgagtgagc ttatacccgc ctgggttggt gacagactgg ctccggtgga caagttggag 60
gtgcatttga aagggctccg ccacaaggcg gtgtctgttt tcgtcatgga tggcgaaaac 120
gtgctgatcc agcgccgctc ggaggagaaa tatcactctc ccgggctttg ggcgaacacc 180
tgctgcaccc atccgggctg gaccgaacgc cccgaggaat gcgcggtgcg gcggctgcgc 240
gaggagctgg ggatcaccgg gctttatccc gcccatgccg accggctgga atatcgcgcc 300
gatgtcggcg gcggcatgat cgagcatgag gtggtcgaca tctatctggc ctatgccaaa 360
ccgcatatgc ggatcacccc cgatccgcgc gaagtggccg aggtgcgctg gatcggcctt 420
tacgatctgg cggccgaggc cggtcggcat cccgagcggt tctcgaaatg gctcaacatc 480
tatctgtcga gccatcttga ccggattttc ggatcgatcc tgcgcggctg a 531



<210> 25 

<211> 65 

<212> DNA 

<213> Artificial Sequence
<220> 

<223> PCR primer containing Streptomyces sp CL190 DNA 
<400> 25

ggggtaccgc ggccgcacgc gtctatgcac caacctttgc ggtcttgttg tcgcgttcca 60 
gctgg 65 

<210> 26 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing S. cerevisiae DNA 



<400> 26 

gagctccacc gcggcggccg cgtcgactac ggccgcagga ggagttcata tgtcagagtt 60 

<210> 27 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing S. cerevisiae DNA 



<400> 27 

tctaccaaag gaagaggagt tttaactcga gtaggaggca catatgtctc agaacgttta 60 

<210> 28 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing Streptomyces sp CL190 and 
R. capsulatus DNA 



<400> 28 

caagaccgca aaggttggtg catagacgcg gtaaggaggc acatatgagt gagcttatac 60

<210> 29
<211> 60
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing R. capsulatus DNA 
<400> 29 

cctgcgcggc tgagcggccg cggatccgat cgcgtgcggc cgcggtaccc aattcgccct 60 

<210> 30 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing Streptomyces sp CL190 
and S. cerevisiae DNA 

<400> 30 

tgtcattgaa aagatatgag gatcctctag gtacttccct ggcgtgtgca gcggttgacg 60 

<210> 31 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing Streptomyces sp CL190 DNA 

<400> 31 

cgattccgca ttatcggtac gggtgcctac ctagaactag tggatccccc gggctgcagg 60 

<210> 32 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing N. tabacum and S. cerevisiae DNA
<400> 32

ctttcctgaa acataattta taatcagatc ggccgcagga ggagttcata tgtcagagtt 60 

<210> 33 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing N. tabacum and R. capsulatus DNA 

<400> 33 

ttcggatcga tcctgcgcgg ctgagcggcc gatctaaaca aacccggaac agaccgttgg 60 

<210> 34 

<211> 59 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing N. tabacum and S. cerevisiae DNA 

<400> 34 

ctttcctgaa acataattta taatcagatc ggccgcagga ggagttcata tgtcagagt 59 

<210> 35 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing N. tabacum and S. pombe DNA 

<400> 35

tcgttgctaa ggatcccccg ggatccggcc gatctaaaca aacccggaac agaccgttgg 60 

<210> 36 
<211> 13 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing NotI restriction site 
<400> 36 

catggcggcc gcg 13

<210> 37 

<211> 13 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing NotI restriction site 

<400> 37 

gatccgcggc cgc 13 

<210> 38 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing S. cerevisiae DNA 

<400> 38

ttaaataagg aggaataaac catggcggcc gcaggaggag ttcatatgtc agagttgaga 60

<210> 39 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing A. thaliana DNA 
<400> 39

aacaacaaca acatgacccg ggatccggcc gcgatccgag ctcgagatct gcagctggta 60

<210> 40

<211> 60

<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide containing S. cerevisiae DNA

<400> 40

tcgattaaat aaggaggaat aaaccatggc ggccgcagga ggagttcata tgtcagagtt 60



<210> 41 

<211> 60 

<212> DNA 

<213> Artificial Sequence
<220> 

<223> Oligonucleotide containing R. capsulatus DNA 

<400> 41 

gattttcgga tcgatcctgc gcggctgagc ggccgcgatc cgagctcgag atctgcagct 60

<210> 42 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing S. cerevisiae DNA 

<400> 42 

tcgattaaat aaggaggaat aaaccatggc ggccgcagga ggagttcata tgtcagagtt 60 

<210> 43 

<211> 60 

<212> DNA 
<213> Artificial Sequence
<220>

<223> Oligonucleotide containing S. pombe DNA
<400> 43

ttcatcgttg ctaaggatcc cccgggatcc ggccgcgatc cgagctcgag atctgcagct 60

<210> 44 

<211> 61 

<212> DNA 

<213> Artificial Sequence 

<220>

<223> Oligonucleotide containing R. capsulatus DNA 



<400> 44 

ttaaataagg aggaataaac catggcggcc gtaaggaggc acatatgagt gagcttatac 60 
t 61

<210> 45 

<211> 61 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing R. capsulatus DNA 

<400> 45 

gcctgcgcgg ctgagcggcc gcggatccga tggccgcgat ccgagctcga gatctgcagc 60 
t 61 

<210> 46 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing S. pombe DNA 
<400> 46

ttaaataagg aggaataaac catggcggcc gtaggaggca catatgagtt cccaacaaga 60
<210> 47

<211> 60
<212> DNA

<213> Artificial Sequence

<220>
<223> Oligonucleotide containing S. pombe DNA 
<400> 47 

accttaattc atcgttgcta aggatccccc ggccgcgatc cgagctcgag atctgcagct 60 

<210> 48 
<211> 1356 
<212> DNA 

<213> Saccharomyces cerevisiae 
<400> 48 

atgtcagagt tgagagcctt cagtgcccca gggaaagcgt tactagctgg tggatattta 60
gttttagata caaaatatga agcatttgta gtcggattat cggcaagaat gcatgctgta 120
gcccatcctt acggttcatt gcaagggtct gataagtttg aagtgcgtgt gaaaagtaaa 180
caatttaaag atggggagtg gctgtaccat ataagtccta aaagtggctt cattcctgtt 240
tcgataggcg gatctaagaa ccctttcatt gaaaaagtta tcgctaacgt atttagctac 300
tttaaaccta acatggacga ctactgcaat agaaacttgt tcgttattga tattttctct 360
gatgatgcct accattctca ggaggatagc gttaccgaac atcgtggcaa cagaagattg 420
agttttcatt cgcacagaat tgaagaagtt cccaaaacag ggctgggctc ctcggcaggt 480
ttagtcacag ttttaactac agctttggcc tccttttttg tatcggacct ggaaaataat 540
gtagacaaat atagagaagt tattcataat ttagcacaag ttgctcattg tcaagctcag 600
ggtaaaattg gaagcgggtt tgatgtagcg gcggcagcat atggatctat cagatataga 660
agattcccac ccgcattaat ctctaatttg ccagatattg gaagtgctac ttacggcagt 720
aaactggcgc atttggttga tgaagaagac tggaatatta cgattaaaag taaccattta 780
ccttcgggat taactttatg gatgggcgat attaagaatg gttcagaaac agtaaaactg 840
gtccagaagg taaaaaattg gtatgattcg catatgccag aaagcttgaa aatatataca 900
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gaactcgatc atgcaaattc tagatttatg 


gatggactat ctaaactaga tcgcttacac 


960 


gagactcatg 


acgattacag 


cgatcagata 


tttgagtctc ttgagaggaa tgactgtacc 


1020 


tgtcaaaagt 


atcctgaaat 


cacagaagtt 


agagatgcag ttgccacaat tagacgttcc 


1080 


tttagaaaaa 


taactaaaga 


atctggtgcc 


gatatcgaac ctcccgtaca aactagctta 


1140 


ttggatgatt 


gccagacctt 


aaaaggagtt 


cttacttgct taatacctgg tgctggtggt 


1200 


tatgacgcca ttgcagtgat tactaagcaa 


aatcfttaatc ttaaaactca aaccactaat 


1260 


gacaaaagat 


tttctaaggt 


tcaatggctg 


gatgtaactc aqoctcractcr aaatcrttacra 


1320 


aaagaaaaag atccggaaac ttatcttgat 


aaataa 


1356 


<210> 49
<211> 1332
<212> DNA

<213> Saccharomyces cerevisiae
<400> 49

atgtcattac cgttcttaac ttctgcaccg ggaaaggtta ttatttttgg tgaacactct 60
gctgtgtaca acaagcctgc cgtcgctgct agtgtgtctg cgttgagaac ctacctgcta 120
ataagcgagt catctgcacc agatactatt gaattggact tcccggacat tagctttaat 180
cataagtggt ccatcaatga tttcaatgcc atcaccgagg atcaagtaaa ctcccaaaaa 240
ttggccaagg ctcaacaagc caccgatggc ttgtctcagg aactcgttag tcttttggat 300
ccgttgttag ctcaactatc cgaatccttc cactaccatg cagcgttttg tttcctgtat 360
atgtttgttt gcctatgccc ccatgccaag aatattaagt tttctttaaa gtctacttta 420
cccatcggtg ctgggttggg ctcaagcgcc tctatttctg tatcactggc cttagctatg 480
gcctacttgg gggggttaat aggatctaat gacttggaaa agctgtcaga aaacgataag 540
catatagtga atcaatgggc cttcataggt gaaaagtgta ttcacggtac cccttcagga 600
atagataacg ctgtggccac ttatggtaat gccctgctat ttgaaaaaga ctcacataat 660
ggaacaataa acacaaacaa ttttaagttc ttagatgatt tcccagccat tccaatgatc 720
ctaacctata ctagaattcc aaggtctaca aaagatcttg ttgctcgcgt tcgtgtgttg 780
gtcaccgaga aatttcctga agttatgaag ccaattctag atgccatggg tgaatgtgcc 840
ctacaaggct tagagatcat gactaagtta agtaaatgta aaggcaccga tgacgaggct 900
gtagaaacta ataatgaact gtatgaacaa ctattggaat tgataagaat aaatcatgga 960
ctgcttgtct caatcggtgt ttctcatcct ggattagaac ttattaaaaa tctgagcgat 1020
gatttgagaa ttggctccac aaaacttacc ggtgctggtg gcggcggttg ctctttgact 1080
ttgttacgaa gagacattac tcaagagcaa attgacagct tcaaaaagaa attgcaagat 1140
gattttagtt acgagacatt tgaaacagac ttgggtggga ctggctgctg tttgttaagc 1200
gcaaaaaatt tgaataaaga tcttaaaatc aaatccctag tattccaatt atttgaaaat 1260
aaaactacca caaagcaaca aattgacgat ctattattgc caggaaacac gaatttacca 1320
tggacttcat aa 1332

<210> 50 

<211> 1191 

<212> DNA 

<213> Saccharomyces cerevisiae 



<400> 50 




rrrt* farrnra en cat' pa a ca t cacaaccct taaatattoo 


60 


ygga.cicid.ygg 


dUd^ycicty l. l 


rT?^^^-r■^-^mr , r , arraat't" rrrt" cratatcaat tiactttatca 


120 






rr^ rT^I - r 1 "! - nrrr rrn+*ar , i"rrpap chcracrtttcia acocoacact 


180 




a Ly y ay aauo 


anarflnr'af r cracaafcraaa craactcaaaa ttcrtctacGC 


240 


gacctacgcc aattaagaaa ggaaatggaa tcgaaggacg cctcattgcc cacattatct 300
caatggaaac tccacattgt ctccgaaaat aactttccta cagcagctgg tttagcttcc 360
tccgctgctg gctttgctgc attggtctct gcaattgcta agttatacca attaccacag 420
tcaacttcag aaatatctag aatagcaaga aaggggtctg gttcagcttg tagatcgttg 480
tttggcggat acgtggcctg ggaaatggga aaagctgaag atggtcatga ttccatggca 540
gtacaaatcg cagacagctc tgactggcct cagatgaaag cttgtgtcct agttgtcagc 600
gatattaaaa aggatgtgag ttccactcag ggtatgcaat tgaccgtggc aacctccgaa 660
ctatttaaag aaagaattga acatgtcgta ccaaagagat ttgaagtcat gcgtaaagcc 720
attgttgaaa aagatttcgc cacctttgca aaggaaacaa tgatggattc caactctttc 780
catgccacat gtttggactc tttccctcca atattctaca tgaatgacac ttccaagcgt 840
atcatcagtt ggtgccacac cattaatcag ttttacggag aaacaatcgt tgcatacacg 900
tttgatgcag gtccaaatgc tgtgttgtac tacttagctg aaaatgagtc gaaactcttt 960
gcatttatct ataaattgtt tggctctgtt cctggatggg acaagaaatt tactactgag 1020
cagcttgagg ctttcaacca tcaatttgaa tcatctaact ttactgcacg tgaattggat 1080
cttgagttgc aaaaggatgt tgccagagtg attttaactc aagtcggttc aggcccacaa 1140
gaaacaaacg aatctttgat tgacgcaaag actggtctac caaaggaata a 1191

<210> 51
<211> 1197
<212> DNA

<213> Saccharomyces cerevisiae

<400> 51

atgtctcaga acgtttacat tgtatcgact gccagaaccc caattggttc attccagggt 60 

tctctatcct ccaagacagc agtggaattg ggtgctgttg ctttaaaagg cgccttggct 120 

aaggttccag aattggatgc atccaaggat tttgacgaaa ttatttttgg taacgttctt 180 

tctgccaatt tgggccaagc tccggccaga caagttgctt tggctgccgg tttgagtaat 240 

catatcgttg caagcacagt taacaaggtc tgtgcatccg ctatgaaggc aatcattttg 300 

ggtgctcaat ccatcaaatg tggtaatgct gatgttgtcg tagctggtgg ttgtgaatct 360 

atgactaacg caccatacta catgccagca gcccgtgcgg gtgccaaatt tggccaaact 420 

gttcttgttg atggtgtcga aagagatggg ttgaacgatg cgtacgatgg tctagccatg 480 

ggtgtacacg cagaaaagtg tgcccgtgat tgggatatta ctagagaaca acaagacaat 540

tttgccatcg aatcctacca aaaatctcaa aaatctcaaa aggaaggtaa attcgacaat 600 

gaaattgtac ctgttaccat taagggattt agaggtaagc ctgatactca agtcacgaag 660 

gacgaggaac ctgctagatt acacgttgaa aaattgagat ctgcaaggac tgttttccaa 720 

aaagaaaacg gtactgttac tgccgctaac gcttctccaa tcaacgatgg tgctgcagcc 780 

gtcatcttgg tttccgaaaa agttttgaag gaaaagaatt tgaagccttt ggctattatc 840 

aaaggttggg gtgaggccgc tcatcaacca gctgatttta catgggctcc atctcttgca 900 

gttccaaagg ctttgaaaca tgctggcatc gaagacatca attctgttga ttactttgaa 960 

ttcaatgaag ccttttcggt tgtcggtttg gtgaacacta agattttgaa gctagaccca 1020 

tctaaggtta atgtatatgg tggtgctgtt gctctaggtc acccattggg ttgttctggt 1080 

gctagagtgg ttgttacact gctatccatc ttacagcaag aaggaggtaa gatcggtgtt 1140 

gccgccattt gtaatggtgg tggtggtgct tcctctattg tcattgaaaa gatatga 1197 

<210> 52 
<211> 1386 
<212> DNA 

<213> Arabidopsis thaliana 



<400> 52 

atggcgaaga acgttgggat tttggctatg gatatctatt tccctcccac ctgtgttcaa 60
caggaagctt tggaagcaca tgatggagca agtaaaggga aatacactat tggacttggc 120
caagattgtt tagctttttg cactgagctt gaagatgtta tctctatgag tttcaatgcg 180
gtgacatcac tttttgagaa gtataagatt gaccctaacc aaatcgggcg tcttgaagta 240
ggaagtgaga ctgttattga caaaagcaag tccatcaaga ccttcttgat gcagctcttt 300
gagaaatgtg gaaacactga tgtcgaaggt gttgactcga ccaatgcttg ctatggtgga 360
actgcagctt tgttaaactg tgtcaattgg gttgagagta actcttggga tggacgttat 420
ggcctcgtca tttgtactga cagcgcggtt tatgcagaag gacccgcaag gcccactgga 480
ggagctgcag cgattgctat gttgatagga cctgatgctc ctatcgtttt cgaaagcaaa 540
ttgagagcaa gccacatggc tcatgtctat gacttttaca agcccaatct tgctagcgag 600
tacccggttg ttgatggtaa gctttcacag acttgctacc tcatggctct tgactcctgc 660
tataaacatt tatgcaacaa gttcgagaag atcgagggca aagagttctc cataaatgat 720
gctgattaca ttgttttcca ttctccatac aataaacttg tacagaaaag ctttgctcgt 780
ctcttgtaca acgacttctt gagaaacgca agctccattg acgaggctgc caaagaaaag 840
ttcacccctt attcatcttt gacccttgac gagagttacc aaagccgtga tcttgaaaag 900
gtgtcacaac aaatttcgaa accgttttat gatgctaaag tgcaaccaac gactttaata 960
ccaaaggaag tcggtaacat gtacactgct tctctctacg ctgcatttgc ttccctcatc 1020
cacaataaac acaatgattt ggcgggaaag cgggtggtta tgttctctta tggaagtggc 1080
tccaccgcaa caatgttctc attacgcctc aacgacaata agcctccttt cagcatttca 1140
aacattgcat ctgtaatgga tgttggcggt aaattgaaag ctagacatga gtatgcacct 1200
gagaagtttg tggagacaat gaagctaatg gaacataggt atggagcaaa ggactttgtg 1260
acaaccaagg agggtattat agatcttttg gcaccgggaa cttattatct gaaagaggtt 1320
gattccttgt accggagatt ctatggcaag aaaggtgaag atggatctgt agccaatgga 1380
cactga 1386



<210> 53
<211> 1779
<212> DNA

<213> Arabidopsis thaliana
<400> 53

atggatctcc gtcggaggcc tcctaaacca ccggttacca acaacaacaa ctccaacgga 60
tctttccgtt cttatcagcc tcgcacttcc gatgacgatc atcgtcgccg ggctacaaca 120
attgctcctc caccgaaagc atccgacgcg cttcctcttc cgttatatct cacaaacgcc 180
gttttcttca cgctcttctt ctccgtcgcg tattacctcc tccaccggtg gcgtgacaag 240
atccgttaca atacgcctct tcacgtcgtc actatcacag aactcggcgc cattattgct 300
ctcatcgctt cgtttatcta tctcctaggg ttttttggta ttgactttgt tcagtcattt 360
atctcacgtg cctctggtga tgcttgggat ctcgccgata cgatcgatga tgatgaccac 420
cgccttgtca cgtgctctcc accgactccg atcgtttccg ttgctaaatt acctaatccg 480
gaacctattg ttaccgaatc gcttcctgag gaagacgagg agattgtgaa atcggttatc 540
gacggagtta ttccatcgta ctcgcttgaa tctcgtctcg gtgattgcaa aagagcggcg 600
tcgattcgtc gtgaggcgtt gcagagagtc accgggagat cgattgaagg gttaccgttg 660
gatggatttg attatgaatc gattttgggg caatgctgtg agatgcctgt tggatacatt 720
cagattcctg ttgggattgc tggtccattg ttgcttgatg gttatgagta ctctgttcct 780
atggctacaa ccgaaggttg tttggttgct agcactaaca gaggctgcaa ggctatgttt 840
atctctggtg gcgccaccag taccgttctt aaggacggta tgacccgagc acctgttgtt 900
cggttcgctt cggcgagacg agcttcggag cttaagtttt tcttggagaa tccagagaac 960
tttgatactt tggcagtagt cttcaacagg tcgagtagat ttgcaagact gcaaagtgtt 1020
aaatgcacaa tcgcggggaa gaatgcttat gtaaggttct gttgtagtac tggtgatgct 1080
atggggatga atatggtttc taaaggtgtg cagaatgttc ttgagtatct taccgatgat 1140
ttccctgaca tggatgtgat tggaatctct ggtaacttct gttcggacaa gaaacctgct 1200
gctgtgaact ggattgaggg acgtggtaaa tcagttgttt gcgaggctgt aatcagagga 1260
gagatcgtga acaaggtctt gaaaacgagc gtggctgctt tagtcgagct caacatgctc 1320
aagaacctag ctggctctgc tgttgcaggc tctctaggtg gattcaacgc tcatgccagt 1380
aacatagtgt ctgctgtatt catagctact ggccaagatc cagctcaaaa cgtggagagt 1440
tctcaatgca tcaccatgat ggaagctatt aatgacggca aagatatcca tatctcagtc 1500
actatgccat ctatcgaggt ggggacagtg ggaggaggaa cacagcttgc atctcaatca 1560
gcgtgtttaa acctgctcgg agttaaagga gcaagcacag agtcgccggg aatgaacgca 1620
aggaggctag cgacgatcgt agccggagca gttttagctg gagagttatc tttaatgtca 1680
gcaattgcag ctggacagct tgtgagaagt cacatgaaat acaatagatc cagccgagac 1740
atctctggag caacgacaac gacaacaaca acaacatga 1779



<210> 54
<211> 684
<212> DNA
<213> Artificial Sequence
<220>

<223> Schizosaccharomyces pombe IDI1 (IPP isomerase)
<400> 54

atgagttccc aacaagagaa aaaggattat gatgaagaac aattaaggtt gatggaagaa 60
gtttgtatcg ttgtagatga aaatgatgtc cctttaagat atggaacgaa aaaggagtgt 120
catttgatgg aaaatataaa taaaggtctt ttgcatagag cattctctat gttcatcttt 180
gatgagcaaa atcgcctttt acttcagcag cgtgcagaag agaaaattac atttccatcc 240
ttatggacga atacatgttg ctcccaccca ttggatgttg ctggtgaacg tggtaatact 300
ttacctgaag ctgttgaagg tgttaagaat gcagctcaac gcaagctgtt ccatgaattg 360
ggtattcaag ccaagtatat tcccaaagac aaatttcagt ttcttacacg aatccattac 420
cttgctccta gtactggtgc ttggggagag catgaaattg actacattct tttcttcaaa 480
ggtaaagttg agctggatat caatcccaat gaagttcaag cctataagta tgttactatg 540
gaagagttaa aagagatgtt ttccgatcct caatatggat tcacaccatg gttcaaactt 600
atttgtgagc attttatgtt taaatggtgg caggatgtag atcatgcgtc aaaattccaa 660
gataccttaa ttcatcgttg ctaa 684



<210> 55 
<211> 531 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Rhodobacter capsulatus idiB (IPP isomerase) 
<400> 55 

atgagtgagc ttatacccgc ctgggttggt gacagactgg ctccggtgga caagttggag 60 
gtgcatttga aagggctccg ccacaaggcg gtgtctgttt tcgtcatgga tggcgaaaac 120 
gtgctgatcc agcgccgctc ggaggagaaa tatcactctc ccgggctttg ggcgaacacc 180 
tgctgcaccc atccgggctg gaccgaacgc cccgaggaat gcgcggtgcg gcggctgcgc 240 
gaggagctgg ggatcaccgg gctttatccc gcccatgccg accggctgga atatcgcgcc 300 
gatgtcggcg gcggcatgat cgagcatgag gtggtcgaca tctatctggc ctatgccaaa 360 
ccgcatatgc ggatcacccc cgatccgcgc gaagtggccg aggtgcgctg gatcggcctt 420
tacgatctgg cggccgaggc cggtcggcat cccgagcggt tctcgaaatg gctcaacatc 480
tatctgtcga gccatcttga ccggattttc ggatcgatcc tgcgcggctg a 531 

<210> 56 
<211> 1059 
<212> DNA 

<213> Streptomyces sp.
<400> 56 

atgacggaaa cgcacgccat agccggggtc ccgatgaggt gggtgggacc ccttcgtatt 60 

tccgggaacg tcgccgagac cgagacccag gtcccgctcg ccacgtacga gtcgccgctg 120 

tggccgtcgg tgggccgcgg ggcgaaggtc tcccggctga cggagaaggg catcgtcgcc 180 

accctcgtcg acgagcggat gacccgctcg gtgatcgtcg aggcgacgga cgcgcagacc 240 

gcgtacatgg ccgcgcagac catccacgcc cgcatcgacg agctgcgcga ggtggtgcgc 300 

ggctgcagcc ggttcgccca gctgatcaac atcaagcacg agatcaacgc gaacctgctg 360 

ttcatccggt tcgagttcac caccggtgac gcctccggcc acaacatggc cacgctcgcc 420

tccgatgtgc tcctggggca cctgctggag acgatccctg gcatctccta cggctcgatc 480

tccggcaact actgcacgga caagaaggcc accgcgatca acggcatcct cggccgcggc 540

aagaacgtga tcaccgagct gctggtgccg cgggacgtcg tcgagaacaa cctgcacacc 600

acggctgcca agatcgtcga gctgaacatc cgcaagaacc tgctcggcac cctgctcgcc 660

ggcggcatcc gctcggccaa cgcccacttc gcgaacatgc tgctcggctt ctacctggcc 720

accggccagg acgccgccaa catcgtcgag ggctcgcagg gcgtcgtcat ggccgaggac 780 

cgcgacggcg acctctactt cgcctgcacc ctgccgaacc tgatcgtcgg cacggtcggc 840 

aacggcaagg gtctcggctt cgtggagacg aacctcgccc ggctcggctg ccgagccgac 900 

cgcgaacccg gggagaacgc ccgccgcctc gccgtcatcg cggcagcgac cgtgctgtgc 960 

ggtgaactct cgctgctcgc ggcacagacg aacccgggcg aactcatgcg cgcgcacgtc 1020 

cagctggaac gcgacaacaa gaccgcaaag gttggtgca 1059 

<210> 57

<211> 6798

<212> DNA

<213> Artificial Sequence
<220>
<223> Streptomyces sp CL190 gene cluster containing mevalonate pathway and IPP isomerase orfs

<400> 57

tacgtacttc cctggcgtgt gcagcggttg acgcgccgtg ccctcgctgc gagcggcgcg 60
cacatctgac gtcctgcttt attgctttct cagaactcgg gacgaagcga tcccatgatc 120
acgcgatctc catgcagaaa agacaaaggg agctgagtgc gttgacacta ccgacctcgg 180
ctgagggggt atcagaaagc caccgggccc gctcggtcgg catcggfcgc gcccacgcca 240
aggccatcct gctgggagag catgcggtcg tctacggagc gccggcactc gctctgccga 300
ttccgcagct cacggtcacg gccagcgtcg gctggtcgtc cgaggcctcc gacagtgcgg 360
gtggcctgtc ctacacgatg accggtacgc cgtcgcgggc actggtgacg caggcctccg 420
acggcctgca ccggctcacc gcggaattca tggcgcggat gggcgtgacg aacgcgccgc 480
acctcgacgt gatcctggac ggcgcgatcc cgcacggccg gggtctcggc tccagcgcgg 540
ccggctcacg cgcgatcgcc ttggccctcg ccgacctctt cggccacgaa ctggccgagc 600
acacggcgta cgaactggtg cagacggccg agaacatggc gcacggccgg gccagcggcg 660
tggacgcgat gacggtcggc gcgtcccggc cgctgctgtt ccagcagggc cgcaccgagc 720
gactggccat cggctgcgac agcctgttca tcgtcgccga cagcggcgtc ccgggcagca 780
ccaaggaagc ggtcgagatg ctgcgggagg gattcacccg cagcgccgga acacaggagc 840
ggttcgtcgg ccgggcgacg gaactgaccg aggccgcccg gcaggccctc gccgacggcc 900
ggcccgagga gctgggctcg cagctgacgt actaccacga gctgctccat gaggcccgcc 960
tgagcaccga cggcatcgat gcgctggtcg aggccgcgct gaaggcaggc agcctcggag 1020
ccaagatcac cggcggtggt ctgggcggct gcatgatcgc acaggcccgg cccgaacagg 1080
cccgggaggt cacccggcag ctccacgagg ccggtgccgt acagacctgg gtcgtaccgc 1140
tgaaagggct cgacaaccat gcgcagtgaa cacccgacca cgaccgtgct ccagtcgcgg 1200
gagcagggca gcgcggccgg cgccaccgcg gtcgcgcacc caaacatcgc gctgatcaag 1260






y y v — l y a w 


ct erceefcerca 


ccaeea erect 


gtcgatgacg 


1320 


ctggacgtct tccccacgac caccgaggtc cggctcgacc ccgccgccga gcacgacacg 1380
gccgccctca acggcgaggt ggccacgggc gagacgctgc gccgcatcag cgccttcctc 1440
tccctggtgc gggaggtggc gggcagcgac cagcgggccg tggtggacac ccgcaacacc 1500
gtgcccaccg gggcgggcct ggcgtcctcc gccagcgggt tcgccgccct cgccgtcgcg 1560
gccgcggccg cctacgggct cgaactcgac gaccgcgggc tgtcccggct ggcccgacgt 1620
ggatccggct ccgcctcgcg gtcgatcttc ggcggcttcg ccgtctggca cgccggcccc 1680
gacggcacgg ccacggaagc ggacctcggc tcctacgccg agccggtgcc cgcggccgac 1740
ctcgacccgg cgctggtcat cgccgtggtc aacgccggcc ccaagcccgt ctccagccgc 1800
gaggccatgc gccgcaccgt cgacacctcg ccgctgtacc ggccgtgggc cgactccagt 1860
aaggacgacc tggacgagat gcgctcggcg ctgctgcgcg gcgacctcga ggccgtgggc 1920
gagatcgcgg agcgcaacgc gctcggcatg cacgccacca tgctggccgc ccgccccgcg 1980
gtgcggtacc tgtcgccggc cacggtcacc gtgctcgaca gcgtgctcca gctccgcaag 2040
gacggtgtcc tggcctacgc gaccatggac gccggtccca acgtgaaggt gctgtgccgg 2100
cgggcggacg ccgagcgggt ggccgacgtc gtacgcgccg ccgcgtccgg cggtcaggtc 2160
ctcgtcgccg ggccgggaga cggtgcccgc ctgctgagcg agggcgcatg acgacaggtc 2220
agcgcacgat cgtccggcac gcgccgggca agctgttcgt cgcgggcgag tacgcggtcg 2280
tggatccggg caacccggcg atcctggtag cggtcgaccg gcacatcagc gtcaccgtgt 2340
ccgacgccga cgcggacacc ggggccgccg acgtcgtgat ctcctccgac ctcggtccgc 2400
aggcggtcgg ctggcgctgg cacgacggcc ggctcgtcgt ccgcgacccg gacgacgggc 2460
agcaggcgcg cagcgccctg gcccacgtgg tgtcggcgat cgagaccgtg ggccggctgc 2520
tgggcgaacg cggacagaag gtccccgctc tcaccctctc cgtcagcagc cgcctgcacg 2580
aggacggccg gaagttcggc ctgggctcca gcggcgcggt gaccgtggcg accgtagccg 2640
ccgtcgccgc gttctgcgga ctcgaactgt ccaccgacga acggttccgg ctggccatgc 2700
tcgccaccgc ggaactcgac cccaagggct ccggcgggga cctcgccgcc agcacctggg 2760
gcggctggat cgcctaccag gcgcccgacc gggcctttgt gctcgacctg gcccggcgcg 2820
tgggagtcga ccggacactg aaggcgccct ggccggggca ctcggtgcgc cgactgccgg 2880
cgcccaaggg cctcaccctg gaggtcggct ggaccggaga gcccgcctcc accgcgtccc 2940
tggtgtccga tctgcaccgc cgcacctggc ggggcagcgc ctcccaccag aggttcgtcg 3000
agaccacgac cgactgtgtc cgctccgcgg tcaccgccct ggagtccggc gacgacacga 3060
gcctgctgca cgagatccgc cgggcccgcc aggagctggc ccgcctggac gacgaggtcg 3120
gcctcggcat cttcacaccc aagctgacgg cgctgtgcga cgccgccgaa gccgtcggcg 3180
gcgcggccaa gccctccggg gcaggcggcg gcgactgcgg catcgccctg ctggacgccg 3240
aggcgtcgcg ggacatcaca catgtacggc aacggtggga gacagccggg gtgctgcccc 3300
tgcccctgac tcctgccctg gaagggatct aagaatgacc agcgcccaac gcaaggacga 3360
ccacgtacgg ctcgccatcg agcagcacaa cgcccacagc ggacgcaacc agttcgacga 3420
cgtgtcgttc gtccaccacg ccctggccgg catcgaccgg ccggacgtgt ccctggccac 3480
gtccttcgcc gggatctcct ggcaggtgcc gatctacatc aacgcgatga ccggcggcag 3540
cgagaagacc ggcctcatca accgggacct ggccaccgcc gcccgcgaga ccggcgtccc 3600
catcgcgtcc gggtccatga acgcgtacat caaggacccc tcctgcgccg acacgttccg 3660
tgtgctgcgc gacgagaacc ccaacgggtt cgtcatcgcg aacatcaacg ccaccacgac 3720
ggtcgacaac gcgcagcgcg cgatcgacct gatcgaggcg aacgccctgc agatccacat 3780
caacacggcg caggagacgc cgatgccgga gggcgaccgg tcgttcgcgt cctgggtccc 3840
gcagatcgag aagatcgcgg cggccgtcga catccccgtg atcgtcaagg aggtcggcaa 3900
cggcctgagc cggcagacca tcctgctgct cgccgacctc ggcgtgcagg cggcggacgt 3960
cagcggccgc ggcggcacgg acttcgcccg catcgagaac ggccgccggg agctcggcga 4020
ctacgcgttc ctgcacggct gggggcagtc caccgccgcc tgcctgctgg acgcccagga 4080
catctccctg cccgtcctcg cctccggcgg tgtgcgtcac ccgctcgacg tggtccgcgc 4140
cctcgcgctc ggcgcccgcg ccgtcggctc ctccgccggc ttcctgcgca ccctgatgga 4200
cgacggcgtc gacgcgctga tcacgaagct cacgacctgg ctggaccagc tggcggcgct 4260
gcagaccatg ctcggcgcgc gcaccccggc cgacctcacc cgctgcgacg tgctgctcca 4320
cggcgagctg cgtgacttct gcgccgaccg gggcatcgac acgcgccgcc tcgcccagcg 4380
ctccagctcc atcgaggccc tccagacgac gggaagcaca cgatgacgga aacgcacgcc 4440
atagccgggg tcccgatgag gtgggtggga ccccttcgta tttccgggaa cgtcgccgag 4500
accgagaccc aggtcccgct cgccacgtac gagtcgccgc tgtggccgtc ggtgggccgc 4560
ggggcgaagg tctcccggct gacggagaag ggcatcgtcg ccaccctcgt cgacgagcgg 4620
atgacccgct cggtgatcgt cgaggcgacg gacgcgcaga ccgcgtacat ggccgcgcag 4680
accatccacg cccgcatcga cgagctgcgc gaggtggtgc gcggctgcag ccggttcgcc 4740
cagctgatca acatcaagca cgagatcaac gcgaacctgc tgttcatccg gttcgagttc 4800
accaccggtg acgcctccgg ccacaacatg gccacgctcg cctccgatgt gctcctgggg 4860
cacctgctgg agacgatccc tggcatctcc tacggctcga tctccggcaa ctactgcacg 4920
gacaagaagg ccaccgcgat caacggcatc ctcggccgcg gcaagaacgt gatcaccgag 4980
ctgctggtgc cgcgggacgt cgtcgagaac aacctgcaca ccacggctgc caagatcgtc 5040
gagctgaaca tccgcaagaa cctgctcggc accctgctcg ccggcggcat ccgctcggcc 5100
aacgcccact tcgcgaacat gctgctcggc ttctacctgg ccaccggcca ggacgccgcc 5160
aacatcgtcg agggctcgca gggcgtcgtc atggccgagg accgcgacgg cgacctctac 5220
ttcgcctgca ccctgccgaa cctgatcgtc ggcacggtcg gcaacggcaa gggtctcggc 5280
ttcgtggaga cgaacctcgc ccggctcggc tgccgagccg accgcgaacc cggggagaac 5340
gcccgccgcc tcgccgtcat cgcggcagcg accgtgctgt gcggtgaact ctcgctgctc 5400
gcggcacaga cgaacccggg cgaactcatg cgcgcgcacg tccagctgga acgcgacaac 5460
aagaccgcaa aggttggtgc atagggcatg tccatctcca taggcattca cgacctgtcg 5520
ttcgccacaa ccgagttcgt cctgccgcac acggcgctcg ccgagtacaa cggcaccgag 5580
atcggcaagt accacgtcgg catcggccag cagtcgatga gcgtgccggc cgccgacgag 5640
gacatcgtga ccatggccgc gaccgcggcg cggcccatca tcgagcgcaa cggcaagagc 5700
cggatccgca cggtcgtgtt cgccacggag tcgtcgatcg accaggcgaa ggcgggcggc 5760
gtgtacgtgc actccctgct ggggctggag tcggcctgcc gggtcgtcga gctgaagcag 5820
gcctgctacg gggccaccgc cgcccttcag ttcgccatcg gcctggtgcg gcgcgacccc 5880
gcccagcagg tcctggtcat cgccagtgac gtctccaagt acgagctgga cagccccggc 5940
gaggcgaccc agggcgcggc cgcggtggcc atgctggtcg gcgccgaccc ggccctgctg 6000
cgtatcgagg agccgtcggg cctgttcacc gccgacgtca tggacttctg gcggcccaac 6060
tacctcacca ccgctctggt cgacggccag gagtccatca acgcctacct gcaggccgtc 6120
gagggcgcct ggaaggacta cgcggagcag gacggccggt cgctggagga gttcgcggcg 6180
ttcgtctacc accagccgtt cacgaagatg gcctacaagg cgcaccgcca cctgctgaac 6240
ttcaacggct acgacaccga caaggacgcc atcgagggcg ccctcggcca gacgacggcg 6300
tacaacaacg tcatcggcaa cagctacacc gcgtcggtgt acctgggcct ggccgccctg 6360
ctcgaccagg cggacgacct gacgggccgt tccatcggct tcctgagcta cggctcgggc 6420
agcgtcgccg agttcttctc gggcaccgtc gtcgccgggt accgcgagcg tctgcgcacc 6480
gaggcgaacc aggaggcgat cgcccggcgc aagagcgtcg actacgccac ctaccgcgag 6540
ctgcacgagt acacgctccc gtccgacggc ggcgaccacg ccaccccggt gcagaccacc 6600
ggccccttcc ggctggccgg gatcaacgac cacaagcgca tctacgaggc gcgctagcga 6660
cacccctcgg caacggggtg cgccactgtt cggcgcaccc cgtgccgggc tttcgcacag 6720
ctattcacga ccatttgagg ggcgggcagc cgcatgaccg acgtccgatt ccgcattatc 6780
ggtacgggtg cctacgta 6798



<210> 58 
<211> 7693 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Operon containing A. thaliana and S. cerevisiae DNA 
<400> 58 

ggccgcgtcg acgccggcgg aggcacatat gtctcagaac gtttacattg tatcgactgc 60 
cagaacccca attggttcat tccagggttc tctatcctcc aagacagcag tggaattggg 120 
tgctgttgct ttaaaaggcg ccttggctaa ggttccagaa ttggatgcat ccaaggattt 180
tgacgaaatt atttttggta acgttctttc tgccaatttg ggccaagctc cggccagaca 240
agttgctttg gctgccggtt tgagtaatca tatcgttgca agcacagtta acaaggtctg 300 

tgcatccgct atgaaggcaa tcattttggg tgctcaatcc atcaaatgtg gtaatgctga 360 

tgttgtcgta gctggtggtt gtgaatctat gactaacgca ccatactaca tgccagcagc 420 

ccgtgcgggt gccaaatttg gccaaactgt tcttgttgat ggtgtcgaaa gagatgggtt 480

gaacgatgcg tacgatggtc tagccatggg tgtacacgca gaaaagtgtg cccgtgattg 540 

ggatattact agagaacaac aagacaattt tgccatcgaa tcctaccaaa aatctcaaaa 600 

atctcaaaag gaaggtaaat tcgacaatga aattgtacct gttaccatta agggatttag 660 

aggtaagcct gatactcaag tcacgaagga cgaggaacct gctagattac acgttgaaaa 720 

attgagatct gcaaggactg ttttccaaaa agaaaacggt actgttactg ccgctaacgc 780 

ttctccaatc aacgatggtg ctgcagccgt catcttggtt tccgaaaaag ttttgaagga 840

aaagaatttg aagcctttgg ctattatcaa aggttggggt gaggccgctc atcaaccagc 900 

tgattttaca tgggctccat ctcttgcagt tccaaaggct ttgaaacatg ctggcatcga 960 

agacatcaat tctgttgatt actttgaatt caatgaagcc ttttcggttg tcggtttggt 1020 

gaacactaag attttgaagc tagacccatc taaggttaat gtatatggtg gtgctgttgc 1080 

tctaggtcac ccattgggtt gttctggtgc tagagtggtt gttacactgc tatccatctt 1140

acagcaagaa ggaggtaaga tcggtgttgc cgccatttgt aatggtggtg gtggtgcttc 1200 

ctctattgtc attgaaaaga tatgaggatc ctctagatgc gcaggaggca catatggcga 1260 

agaacgttgg gattttggct atggatatct atttccctcc cacctgtgtt caacaggaag 1320 

ctttggaagc acatgatgga gcaagtaaag ggaaatacac tattggactt ggccaagatt 1380 

gtttagcttt ttgcactgag cttgaagatg ttatctctat gagtttcaat gcggtgacat 1440 

cactttttga gaagtataag attgacccta accaaatcgg gcgtcttgaa gtaggaagtg 1500 

agactgttat tgacaaaagc aagtccatca agaccttctt gatgcagctc tttgagaaat 1560 

gtggaaacac tgatgtcgaa ggtgttgact cgaccaatgc ttgctatggt ggaactgcag 1620 

ctttgttaaa ctgtgtcaat tgggttgaga gtaactcttg ggatggacgt tatggcctcg 1680 

tcatttgtac tgacagcgcg gtttatgcag aaggacccgc aaggcccact ggaggagctg 1740

cagcgattgc tatgttgata ggtcctgatg ctcctatcgt tttcgaaagc aaattgagag 1800

caagccacat ggctcatgtc tatgactttt acaagcccaa tcttgctagc gagtacccgg 1860

ttgttgatgg taagctttca cagacttgct acctcatggc tcttgactcc tgctataaac 1920
atttatgcaa caagttcgag aagatcgagg gcaaagagtt ctccataaat gatgctgatt 1980

acattgtttt ccattctcca tacaataaac ttgtacagaa aagctttgct cgtctcttgt 2040

acaacgactt cttgagaaac gcaagctcca ttgacgaggc tgccaaagaa aagttcaccc 2100

cttattcatc tttgaccctt gacgagagtt accaaagccg tgatcttgaa aaggtgtcac 2160

aacaaattgc gaaaccgttt tatgatgcta aagtgcaacc aacgacttta ataccaaagg 2220

aagtcggtaa catgtacact gcttctctct acgctgcatt tgcttccctc atccacaaga 2280

aacacaatga tttggcggga aagcgggtgg ttatgttctc ttatggaagt ggctcaaccg 2340

caacaatgtt ctcattacgc ctcaacgaca ataagcctcc tttcagcatt tcaaacattg 2400

catctgtaat ggatgttggc ggtaaattga aagctagaca tgagtatgca cctgagaagt 2460

ttgtggagac aatgaagcta atggaacata ggtatggagc aaaggacttt gtgacaacca 2520

aggagggtat tatagatctt ttggcaccgg gaacttatta tctgaaagag gttgattcct 2580

tgtaccggag attctatggc aagaaaggtg aagatggatc tgtagccaat ggacactgag 2640

gatccgtcga gcacgtggag gcacatatgc aatgctgtga gatgcctgtt ggatacattc 2700

agattcctgt tgggattgct ggtccattgt tgcttgatgg ttatgagtac tctgttccta 2760

tggctacaac cgaaggttgt ttggttgcta gcactaacag aggctgcaag gctatgttta 2820

tctctggtgg cgccaccagt accgttctta aggacggtat gacccgagca cctgttgttc 2880

ggttcgcttc ggcgagacga gcttcggagc ttaagttttt cttggagaat ccagagaact 2940

ttgatacttt ggcagtagtc ttcaacaggt cgagtagatt tgcaagactg caaagtgtta 3000

aatgcacaat cgcggggaag aatgcttatg taaggttctg ttgtagtact ggtgatgcta 3060

tggggatgaa tatggtttct aaaggtgtgc agaatgttct tgagtatctt accgatgatt 3120

tccctgacat ggatgtgatt ggaatctctg gtaacttctg ttcggacaag aaacctgctg 3180

ctgtgaactg gattgaggga cgtggtaaat cagttgtttg cgaggctgta atcagaggag 3240

agatcgtgaa caaggtcttg aaaacgagcg tggctgcttt agtcgagctc aacatgctca 3300

agaacctagc tggctctgct gttgcaggct ctctaggtgg attcaacgct catgccagta 3360

acatagtgtc tgctgtattc atagctactg gccaagatcc agctcaaaac gtggagagtt 3420

ctcaatgcat caccatgatg gaagctatta atgacggcaa agatatccat atctcagtca 3480

ctatgccatc tatcgaggtg gggacagtgg gaggaggaac acagcttgca tctcaatcag 3540

cgtgtttaaa cctgctcgga gttaaaggag caagcacaga gtcgccggga atgaacgcaa 3600

ggaggctagc gacgatcgta gccggagcag ttttagctgg agagttatct ttaatgtcag 3660

caattgcagc tggacagctt gtgagaagtc acatgaaata caatagatcc agccgagaca 3720

tctctggagc aacgacaacg acaacaacaa caacatgacc cgggatccgg ccgcaggagg 3780

agttcatatg tcagagttga gagccttcag tgccccaggg aaagcgttac tagctggtgg 3840

atatttagtt ttagatacaa aatatgaagc atttgtagtc ggattatcgg caagaatgca 3900

tgctgtagcc catccttacg gttcattgca agggtctgat aagtttgaag tgcgtgtgaa 3960

aagtaaacaa tttaaagatg gggagtggct gtaccatata agtcctaaaa gtggcttcat 4020

tcctgtttcg ataggcggat ctaagaaccc tttcattgaa aaagttatcg ctaacgtatt 4080

tagctacttt aaacctaaca tggacgacta ctgcaataga aacttgttcg ttattgatat 4140 

tttctctgat gatgcctacc attctcagga ggatagcgtt accgaacatc gtggcaacag 4200 

aagattgagt tttcattcgc acagaattga agaagttccc aaaacagggc tgggctcctc 4260 

ggcaggttta gtcacagttt taactacagc tttggcctcc ttttttgtat cggacctgga 4320 

aaataatgta gacaaatata gagaagttat tcataattta gcacaagttg ctcattgtca 4380 

agctcagggt aaaattggaa gcgggtttga tgtagcggcg gcagcatatg gatctatcag 4440 

atatagaaga ttcccacccg cattaatctc taatttgcca gatattggaa gtgctactta 4500 

cggcagtaaa ctggcgcatt tggttgatga agaagactgg aatattacga ttaaaagtaa 4560 

ccatttacct tcgggattaa ctttatggat gggcgatatt aagaatggtt cagaaacagt 4620

aaaactggtc cagaaggtaa aaaattggta tgattcgcat atgccagaaa gcttgaaaat 4680

atatacagaa ctcgatcatg caaattctag atttatggat ggactatcta aactagatcg 4740

cttacacgag actcatgacg attacagcga tcagatattt gagtctcttg agaggaatga 4800

ctgtacctgt caaaagtatc ctgaaatcac agaagttaga gatgcagttg ccacaattag 4860 

acgttccttt agaaaaataa ctaaagaatc tggtgccgat atcgaacctc ccgtacaaac 4920 

tagcttattg gatgattgcc agaccttaaa aggagttctt acttgcttaa tacctggtgc 4980 

tggtggttat gacgccattg cagtgattac taagcaagat gttgatctta gggctcaaac 5040 

cgctaatgac aaaagatttt ctaaggttca atggctggat gtaactcagg ctgactgggg 5100 

tgttaggaaa gaaaaagatc cggaaactta tcttgataaa ctgcaggagg agttttaatg 5160 

tcattaccgt tcttaacttc tgcaccggga aaggttatta tttttggtga acactctgct 5220 

gtgtacaaca agcctgccgt cgctgctagt gtgtctgcgt tgagaaccta cctgctaata 5280

agcgagtcat ctgcaccaga tactattgaa ttggacttcc cggacattag ctttaatcat 5340 

aagtggtcca tcaatgattt caatgccatc accgaggatc aagtaaactc ccaaaaattg 5400 

gccaaggctc aacaagccac cgatggcttg tctcaggaac tcgttagtct tttggatccg 5460 

ttgttagctc aactatccga atccttccac taccatgcag cgttttgttt cctgtatatg 5520 

tttgtttgcc tatgccccca tgccaagaat attaagtttt ctttaaagtc tactttaccc 5580 

atcggtgctg ggttgggctc aagcgcctct atttctgtat cactggcctt agctatggcc 5640

tacttggggg ggttaatagg atctaatgac ttggaaaagc tgtcagaaaa cgataagcat 5700 

atagtgaatc aatgggcctt cataggtgaa aagtgtattc acggtacccc ttcaggaata 5760 

gataacgctg tggccactta tggtaatgcc ctgctatttg aaaaagactc acataatgga 5820 

acaataaaca caaacaattt taagttctta gatgatttcc cagccattcc aatgatccta 5880 

acctatacta gaattccaag gtctacaaaa gatcttgttg ctcgcgttcg tgtgttggtc 5940 
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acccracTaaat ttcctaaaat 


tat craacrppa 


a t" 1" p1~ art a i~ rr 
a l. LLLayaty 


ppa 4~ nrrrrf" rra 
v_L>a Lyyy uy a 


duy LyLLLLd 


DUUU 


caaggcttag agatcatgac 


t aaatt a a erf 


a a a 4~ rrt* a a a rr 
aaa l. y Laaay 


rTPaorrraiTTa 
yLduoya Lya 


pna rr rr f~*4~ rr4~ a 
uy ay y u u y u a. 


DUDU 


aaaactaata ataaactata 


t era a pa a pi - a 


1 4* nrra a 4- 4- rra 
l. uyyddL iya 


4 m aarraa4-aaa 
Lddydd Laaa 


LLdLyydLLy 




cttcrtpt" paa t" rnnf rrt* 4-4- r» 


I'paf'r 'p4~ nrta 
LOa.UCCL.yya 


4* +■ ^ at ^ ^ as 4- 4~ ~» 
L LayaaCtUa 


4" 4" —s 4~ as 4~ 

l caaaaa lc u 


gagegatgat 


6180 


ttcraoaattcf ctp1~ pr*^ ^ 


a p4" 4~ a pp*prrr4~ 
qll LdLuyy l 


AT AS 4- AT AT 4- ATAS-ASAf 

y CLygcggcg 


AY AS AT AT 4" 4" ATAs4~ AS 

ycy y cege lc 


4-4-4-at-jat4--(-4-at 

l LugacuL Ly 






arfarfpaaaff 
aydy tdddi L 


Afa AS->A«rAs4~4 _ AS^ 

ydCdycttCa 


3^5^ /T fs — \ 4™ 4~ 

aaaagaaa l l 


gcaagatgat 


r t a n 

6300 


f a rrh 4* -a as at arra<-»a4-4-4-rra 
LLLay LLaLL) dydLaLLLyd 


ra as — \ at — > ai 4- 4— 

aacagacT, i,g 


99*tgggactg 


getgetgttt 


gttaagegea 


6360 


aa.ciacii_i_L.ycl a LaaaLj aLDL 


4~ r!!_ TT5 4~ r> ^% TS ~\ 

tddddLCada 


tccctagtat 


tccaattatt 


tgaaaataaa 


6420 


ci u LQLUfluad ayLaauadaL 


4-«3 n f~r 4~ p4~ !3 

LydCydlCLa 


ttattgccag 


gaaacacgaa 


tttaccatgg 


64 80 


duLLLaodyy dy g ay L l L ua 


_\ 4* or "■*« as 4— /-r 4* — \ 4— 

a ugacugua u 


at actgetag 


tgtaactget 


ceggtaaata 


6540 


LCyLLdULLL LddLJ LdLLyy 


gggaaaaggg 


acacgaagtt 


gaatctgccc 


accaattcgt 


6600 


ppa f* a 4* p»a n't rrarff f afr»n 


paarraf frapp 
Lddyd lydt-L 


4™ at ^ rr 5 % as rr 4~ 4" 

LCayaaCy L u 


gacct ctgcg 


get act gca c 


fobbU 


ct cracrt 1" 4- /-ra arrrrnarapt" 


4" +" rr4" pt PT+" 4- a a 
LLyLyyuLaa 


a 4" AT AT -J AS- *s AS AS 

a cy y agaacc 


acacagca lc 


gacaatgaaa 


/*TOP< 


naartraap A 4r4^pr4r p4t atasas-as 


yaLLLdLyLL 


adL l aagaaa 


ggaaatggaa 


tcgaaggacg 


pn o n 

678 0 


npf na 4- 4- rrr*r* r> a 0 a 4* 4- a 4- 0 4~ 


caaeggaaac 


tccacattgt 


ctccgaaaat 


aactttccta 


6840 


rs <_» rrpa pf nt" pt/t 4"4- 4* a/"*/"i4- 4- rip 

L-cayuayLtyy l l tag c lu.cc 


4- /-»/-> AT AS 4- AT AS 4™ AT 

Lccgc Lgciig 


getttgetge 


attggtctct 


geaattgeta 


6900 


dy LUdLdotd aLuaccacag 


tcaacttcag 


aaatatctag 


aatagcaaga 


aaggggtctg 


6960 


ft4~ 4~ p a pp4~ 4* or 4" as* 4- as at 4- 4- at 

y l. Lcay c l uy cagatcy tty 


tttggcggat 


acgtggcctg 


ggaaatggga 


aaagctgaag 


7020 


duggucauga uLCcarggca 


gtacaaatcg 


cagacagctc 


tgactggect 


cagatgaaag 


7080 


p4" 4~ rr4~ rxY pp4* arr4- 4~ rr+- pa/s-As 
L-LLy LyLLLL ay l uy LCdyC 


gacaLcaaaa 


aggatgtgag 


ttccactcag 


ggtatgcaat 


7140 


^ a r\ c c Ci\~ cx ci c aapplrppaa 
uyau^y tyyu auL-L- Lttyaa 


as4- ^ 4- 4- 4~ a 3 0 at 
LLdLLLdddy 


aaagaattga 


acatgtcgta 


ccaaagagat 


/200 


tto/aacrtpat nrnt"3a3p;pp 

LLyaay luql yuy Laadyuu 


dLiy LLyadd 


aapraH"+" pptas 
ddyaLttCyC 


^ ^» 4" 4* /^r *_s 

cacc l l l gca 


aaggaaacaa 




t" na 4~ npra 4- +■ p paapfpfl-fp 
L.y a uy y a u ll uaaULULLLL 


Pafrrppapa^ 

CdLy cedea L 


/T I ' 4" 4" AT AT «-\ AS 4* AS 

g l. l eggae l c 


tttccctcca 


atat tctaca 


•"7 O O A 

/ 320 


taaataacac ttcraaorrrt" 


a+/pafpa rr4~ 4* 
a Luauiydy L.L. 


ftnt/pppapa as 

y y LytLdLa l> 


Asa4~4— a of" Aia at - 
LdlLddLLdlJ 


u L u l aeggag 


/ jo U 


aaacaatncrt f rrpa^apaprr 


4- 4- 4- rr a 4- rr P a rr 
l. l» uyaLy uay 


y LLUaadLLJt 


4" at4~ at4~ 4- at 4- -a as 

tyty tuytac 


4* - SAs4 _ 4- — 1 AT AS 4" AT 

Lac l uagctg 


"7 A a n 
/ ft fl u 


aaaatgagtc gaaactcttt gcatttatct ataaattgtt tggctctgtt cctggatggg 7500

acaagaaatt tactactgag cagcttgagg ctttcaacca tcaatttgaa tcatctaact 7560

ttactgcacg tgaattggat cttgagttgc aaaaggatgt tgccagagtg attttaactc 7620

aagtcggttc aggcccacaa gaaacaaacg aatctttgat tgacgcaaag actggtctac 7680

caaaggaata act 7693
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<220> 

<223> Operon B containing A. thaliana and S. cerevisiae DNA 



<400> 59 
ggccgcagga ggagttcata tgtcagagtt gagagccttc agtgccccag ggaaagcgtt 60

actagctggt ggatatttag ttttagatac aaaatatgaa gcatttgtag tcggattatc 120

ggcaagaatg catgctgtag cccatcctta cggttcattg caagggtctg ataagtttga 180

agtgcgtgtg aaaagtaaac aatttaaaga tggggagtgg ctgtaccata taagtcctaa 240

aagtggcttc attcctgttt cgataggcgg atctaagaac cctttcattg aaaaagttat 300

cgctaacgta tttagctact ttaaacctaa catggacgac tactgcaata gaaacttgtt 360

cgttattgat attttctctg atgatgccta ccattctcag gaggatagcg ttaccgaaca 420

tcgtggcaac agaagattga gttttcattc gcacagaatt gaagaagttc ccaaaacagg 480

gctgggctcc tcggcaggtt tagtcacagt tttaactaca gctttggcct ccttttttgt 540

atcggacctg gaaaataatg tagacaaata tagagaagtt attcataatt tagcacaagt 600

tgctcattgt caagctcagg gtaaaattgg aagcgggttt gatgtagcgg cggcagcata 660

tggatctatc agatatagaa gattcccacc cgcattaatc tctaatttgc cagatattgg 720

aagtgctact tacggcagta aactggcgca tttggttgat gaagaagact ggaatattac 780

gattaaaagt aaccatttac cttcgggatt aactttatgg atgggcgata ttaagaatgg 840

ttcagaaaca gtaaaactgg tccagaaggt aaaaaattgg tatgattcgc atatgccaga 900

aagcttgaaa atatatacag aactcgatca tgcaaattct agatttatgg atggactatc 960

taaactagat cgcttacacg agactcatga cgattacagc gatcagatat ttgagtctct 1020

tgagaggaat gactgtacct gtcaaaagta tcctgaaatc acagaagtta gagatgcagt 1080

tgccacaatt agacgttcct ttagaaaaat aactaaagaa tctggtgccg atatcgaacc 1140

tcccgtacaa actagcttat tggatgattg ccagacctta aaaggagttc ttacttgctt 1200

aatacctggt gctggtggtt atgacgccat tgcagtgatt actaagcaag atgttgatct 1260

tagggctcaa accgctaatg acaaaagatt ttctaaggtt caatggctgg atgtaactca 1320

ggctgactgg ggtgttagga aagaaaaaga tccggaaact tatcttgata aactgcagga 1380

ggagttttaa tgtcattacc gttcttaact tctgcaccgg gaaaggttat tatttttggt 1440

gaacactctg ctgtgtacaa caagcctgcc gtcgctgcta gtgtgtctgc gttgagaacc 1500

tacctgctaa taagcgagtc atctgcacca gatactattg aattggactt cccggacatt 1560
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agctttaatc ataagtggtc catcaatgat ttcaatgcca tcaccgagga tcaagtaaac 1620 

tcccaaaaat tggccaaggc tcaacaagcc accgatggct tgtctcagga actcgttagt 1680 

cttttggatc cgttgttagc tcaactatcc gaatccttcc actaccatgc agcgttttgt 1740

ttcctgtata tgtttgtttg cctatgcccc catgccaaga atattaagtt ttctttaaag 1800 

tctactttac ccatcggtgc tgggttgggc tcaagcgcct ctatttctgt atcactggcc 1860

ttagctatgg cctacttggg ggggttaata ggatctaatg acttggaaaa gctgtcagaa 1920 

aacgataagc atatagtgaa tcaatgggcc ttcataggtg aaaagtgtat tcacggtacc 1980 

ccttcaggaa tagataacgc tgtggccact tatggtaatg ccctgctatt tgaaaaagac 2040

tcacataatg gaacaataaa cacaaacaat tttaagttct tagatgattt cccagccatt 2100 

ccaatgatcc taacctatac tagaattcca aggtctacaa aagatcttgt tgctcgcgtt 2160 

cgtgtgttgg tcaccgagaa atttcctgaa gttatgaagc caattctaga tgccatgggt 2220 

gaatgtgccc tacaaggctt agagatcatg actaagttaa gtaaatgtaa aggcaccgat 2280 

gacgaggctg tagaaactaa taatgaactg tatgaacaac tattggaatt gataagaata 2340

aatcatggac tgcttgtctc aatcggtgtt tctcatcctg gattagaact tattaaaaat 2400

ctgagcgatg atttgagaat tggctccaca aaacttaccg gtgctggtgg cggcggttgc 2460

tctttgactt tgttacgaag agacattact caagagcaaa ttgacagctt caaaaagaaa 2520 

ttgcaagatg attttagtta cgagacattt gaaacagact tgggtgggac tggctgctgt 2580 

ttgttaagcg caaaaaattt gaataaagat cttaaaatca aatccctagt attccaatta 2640

tttgaaaata aaactaccac aaagcaacaa attgacgatc tattattgcc aggaaacacg 2700 

aatttaccat ggacttcaga cgaggagttt taatgactgt atatactgct agtgtaactg 2760

ctccggtaaa tattgctact cttaagtatt gggggaaaag ggacacgaag ttgaatctgc 2820 

ccaccaattc gtccatatca gtgactttat cgcaagatga cctcagaacg ttgacctctg 2880 

cggctactgc acctgagttt gaacgcgaca ctttgtggtt aaatggagaa ccacacagca 2940

tcgacaatga aagaactcaa aattgtctgc gcgacctacg ccaattaaga aaggaaatgg 3000 

aatcgaagga cgcctcattg cccacattat ctcaatggaa actccacatt gtctccgaaa 3060 

ataactttcc tacagcagct ggtttagctt cctccgctgc tggctttgct gcattggtct 3120 

ctgcaattgc taagttatac caattaccac agtcaacttc agaaatatct agaatagcaa 3180 

gaaaggggtc tggttcagct tgtagatcgt tgtttggcgg atacgtggcc tgggaaatgg 3240 

gaaaagctga agatggtcat gattccatgg cagtacaaat cgcagacagc tctgactggc 3300 

ctcagatgaa agcttgtgtc ctagttgtca gcgatattaa aaaggatgtg agttccactc 3360 

agggtatgca attgaccgtg gcaacctccg aactatttaa agaaagaatt gaacatgtcg 3420 

taccaaagag atttgaagtc atgcgtaaag ccattgttga aaaagatttc gccacctttg 3480
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caaaggaaac aatgatggat tccaactctt tccatgccac 
caatattcta catgaatgac acttccaagc gtatcatcag 
agttttacgg agaaacaatc gttgcataca cgtttgatgc 
actacttagc tgaaaatgag tcgaaactct ttgcatttat 
ttcctggatg ggacaagaaa tttactactg agcagcttga 
aatcatctaa ctttactgca cgtgaattgg atcttgagtt 
tgattttaac tcaagtcggt tcaggcccac aagaaacaaa 
agactggtct accaaaggaa gaggagtttt aactcgacgc 
cagaacgttt acattgtatc gactgccaga accccaattg 
tcctccaaga cagcagtgga attgggtgct gttgctttaa 
ccagaattgg atgcatccaa ggattttgac gaaattattt 
aatttgggcc aagctccggc cagacaagtt gctttggctg 
gttgcaagca cagttaacaa ggtctgtgca tccgctatga 
caatccatca aatgtggtaa tgctgatgtt gtcgtagctg 
aacgcaccat actacatgcc agcagcccgt gcgggtgcca 
gttgatggtg tcgaaagaga tgggttgaac gatgcgtacg 
cacgcagaaa agtgtgcccg tgattgggat attactagag 
atcgaatcct accaaaaatc tcaaaaatct caaaaggaag 
gtacctgtta ccattaaggg atttagaggt aagcctgata 
gaacctgcta gattacacgt tgaaaaattg agatctgcaa 
aacggtactg ttactgccgc taacgcttct ccaatcaacg 
ttggtttccg aaaaagtttt gaaggaaaag aatttgaagc 
tggggtgagg ccgctcatca accagctgat tttacatggg 
aaggctttga aacatgctgg catcgaagac atcaattctg 
gaagcctttt cggttgtcgg tttggtgaac actaagattt 
gttaatgtat atggtggtgc tgttgctcta ggtcacccat 
gtggttgtta cactgctatc catcttacag caagaaggag 
atttgtaatg gtggtggtgg tgcttcctct attgtcattg 
agatgcgcag gaggcacata tggcgaagaa cgttgggatt 
ccctcccacc tgtgttcaac aggaagcttt ggaagcacat 
atacactatt ggacttggcc aagattgttt agctttttgc 
ctctatgagt ttcaatgcgg tgacatcact ttttgagaag 



atgtttggac tctttccctc 


3540 


ttggtgccac accattaatc 


3600 


aggtccaaat gctgtgttgt 


3660 


ctataaattg tttggctctg 


3720 


ggctttcaac catcaatttg 


3780 


gcaaaaggat gttgccagag 


3840 


cgaatctttg attgacgcaa 


3900 


cggcggaggc acatatgtct 


3960 


gttcattcca gggttctcta 


4020 


aaggcgcctt ggctaaggtt 


4080 


ttggtaacgt tctttctgcc 


4140 


ccggtttgag taatcatatc 


4200 


aggcaatcat tttgggtgct 


4260 


gtggttgtga atctatgact 


4320 


aatttggcca aactgttctt 


4380 


atggtctagc catgggtgta 


4440 


aacaacaaga caattttgcc 


» 4500 


gtaaattcga caatgaaatt 


4560 


ctcaagtcac gaaggacgag 


4620 


ggactgtttt ccaaaaagaa 


4680 


atggtgctgc agccgtcatc 


4740 


ctttggctat tatcaaaggt 


4800 


ctccatctct tgcagttcca 


4860 


ttgattactt tgaattcaat 


4920 


tgaagctaga cccatctaag 


4980 


tgggttgttc tggtgctaga 


5040 


gtaagatcgg tgttgccgcc 


5100 


aaaagatatg aggatcctct 


5160 


ttggctatgg atatctattt 


5220 


gatggagcaa gtaaagggaa 


5280 


actgagcttg aagatgttat 


5340 


tataagattg accctaacca 


5400 
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aatcgggcgt cttgaagtag gaagtgagac tgttattgac aaaagcaagt ccatcaagac 54 60 

cttcttgatg cagctctttg agaaatgtgg aaacactgat gtcgaaggtg ttgactcgac 5520 

caatgcttgc tatggtggaa ctgcagcttt gttaaactgt gtcaattggg ttgagagtaa 5580

ctcttgggat ggacgttatg gcctcgtcat ttgtactgac agcgcggttt atgcagaagg 5640 

acccgcaagg cccactggag gagctgcagc gattgctatg ttgataggac ctgatgctcc 5700

tatcgttttc gaaagcaaat tgagagcaag ccacatggct catgtctatg acttttacaa 5760

gcccaatctt gctagcgagt acccggttgt tgatggtaag ctttcacaga cttgctacct 5820 

catggctctt gactcctgct ataaacattt atgcaacaag ttcgagaaga tcgagggcaa 5880 

agagttctcc ataaatgatg ctgattacat tgttttccat tctccataca ataaacttgt 5940 

acagaaaagc tttgctcgtc tcttgtacaa cgacttcttg agaaacgcaa gctccattga 6000 

cgaggctgcc aaagaaaagt tcacccctta ttcatctttg acccttgacg agagttacca 6060 

aagccgtgat cttgaaaagg tgtcacaaca aatttcgaaa ccgttttatg atgctaaagt 6120 

gcaaccaacg actttaatac caaaggaagt cggtaacatg tacactgctt ctctctacgc 6180 

tgcatttgct tccctcatcc acaataaaca caatgatttg gcgggaaagc gggtggttat 6240

gttctcttat ggaagtggct ccaccgcaac aatgttctca ttacgcctca acgacaataa 6300 

gcctcctttc agcatttcaa acattgcatc tgtaatggat gttggcggta aattgaaagc 6360 

tagacatgag tatgcacctg agaagtttgt ggagacaatg aagctaatgg aacataggta 6420 

tggagcaaag gactttgtga caaccaagga gggtattata gatcttttgg caccgggaac 6480

ttattatctg aaagaggttg attccttgta ccggagattc tatggcaaga aaggtgaaga 6540 

tggatctgta gccaatggac actgaggatc cgtcgagcac gtggaggcac atatgcaatg 6600 

ctgtgagatg cctgttggat acattcagat tcctgttggg attgctggtc cattgttgct 6660 

tgatggttat gagtactctg ttcctatggc tacaaccgaa ggttgtttgg ttgctagcac 6720 

taacagaggc tgcaaggcta tgtttatctc tggtggcgcc accagtaccg ttcttaagga 6780 

cggtatgacc cgagcacctg ttgttcggtt cgcttcggcg agacgagctt cggagcttaa 6840 

gtttttcttg gagaatccag agaactttga tactttggca gtagtcttca acaggtcgag 6900 

tagatttgca agactgcaaa gtgttaaatg cacaatcgcg gggaagaatg cttatgtaag 6960 

gttctgttgt agtactggtg atgctatggg gatgaatatg gtttctaaag gtgtgcagaa 7020 

tgttcttgag tatcttaccg atgatttccc tgacatggat gtgattggaa tctctggtaa 7080 

cttctgttcg gacaagaaac ctgctgctgt gaactggatt gagggacgtg gtaaatcagt 7140 

tgtttgcgag gctgtaatca gaggagagat cgtgaacaag gtcttgaaaa cgagcgtggc 7200 

tgctttagtc gagctcaaca tgctcaagaa cctagctggc tctgctgttg caggctctct 7260 

aggtggattc aacgctcatg ccagtaacat agtgtctgct gtattcatag ctactggcca 7320 
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agatccagct caaaacgtgg agagttctca atgcatcacc atgatggaag ctattaatga -7380 

cggcaaagat atccatatct cagtcactat gccatctatc gaggtgggga cagtgggagg 7440 

aggaacacag cttgcatctc aatcagcgtg tttaaacctg ctcggagtta aaggagcaag 7500 

cacagagtcg ccgggaatga acgcaaggag gctagcgacg atcgtagccg gagcagtttt 7560 

agctggagag ttatctttaa tgtcagcaat tgcagctgga cagcttgtga gaagtcacat 7620

gaaatacaat agatccagcc gagacatctc tggagcaacg acaacgacaa caacaacaac 7680 

atgacccggg atccg 7695

<210> 60 
<211> 8235 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Operon C containing A. thaliana, S. cerevisiae, and R. capsulatus DNA



<400> 60 
ggccgcagga ggagttcata tgtcagagtt gagagccttc agtgccccag ggaaagcgtt 60

actagctggt ggatatttag ttttagatac aaaatatgaa gcatttgtag tcggattatc 120

ggcaagaatg catgctgtag cccatcctta cggttcattg caagggtctg ataagtttga 180

agtgcgtgtg aaaagtaaac aatttaaaga tggggagtgg ctgtaccata taagtcctaa 240

aagtggcttc attcctgttt cgataggcgg atctaagaac cctttcattg aaaaagttat 300

cgctaacgta tttagctact ttaaacctaa catggacgac tactgcaata gaaacttgtt 360

cgttattgat attttctctg atgatgccta ccattctcag gaggatagcg ttaccgaaca 420

tcgtggcaac agaagattga gttttcattc gcacagaatt gaagaagttc ccaaaacagg 480

gctgggctcc tcggcaggtt tagtcacagt tttaactaca gctttggcct ccttttttgt 540

atcggacctg gaaaataatg tagacaaata tagagaagtt attcataatt tagcacaagt 600

tgctcattgt caagctcagg gtaaaattgg aagcgggttt gatgtagcgg cggcagcata 660

tggatctatc agatatagaa gattcccacc cgcattaatc tctaatttgc cagatattgg 720

aagtgctact tacggcagta aactggcgca tttggttgat gaagaagact ggaatattac 780

gattaaaagt aaccatttac cttcgggatt aactttatgg atgggcgata ttaagaatgg 840

ttcagaaaca gtaaaactgg tccagaaggt aaaaaattgg tatgattcgc atatgccaga 900

aagcttgaaa atatatacag aactcgatca tgcaaattct agatttatgg atggactatc 960

taaactagat cgcttacacg agactcatga cgattacagc gatcagatat ttgagtctct 1020
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tgagaggaat gactgtacct gtcaaaagta tcctgaaatc acagaagtta gagatgcagt 1080 

tgccacaatt agacgttcct ttagaaaaat aactaaagaa tctggtgccg atatcgaacc 1140 

tcccgtacaa actagcttat tggatgattg ccagacctta aaaggagttc ttacttgctt 1200 

aatacctggt gctggtggtt atgacgccat tgcagtgatt actaagcaag atgttgatct 1260 

tagggctcaa accgctaatg acaaaagatt ttctaaggtt caatggctgg atgtaactca 1320 

ggctgactgg ggtgttagga aagaaaaaga tccggaaact tatcttgata aactgcagga 1380 

ggagttttaa tgtcattacc gttcttaact tctgcaccgg gaaaggttat tatttttggt 1440 

gaacactctg ctgtgtacaa caagcctgcc gtcgctgcta gtgtgtctgc gttgagaacc 1500 

tacctgctaa taagcgagtc atctgcacca gatactattg aattggactt cccggacatt 1560 

agctttaatc ataagtggtc catcaatgat ttcaatgcca tcaccgagga tcaagtaaac 1620 

tcccaaaaat tggccaaggc tcaacaagcc accgatggct tgtctcagga actcgttagt 1680 

cttttggatc cgttgttagc tcaactatcc gaatccttcc actaccatgc agcgttttgt 1740 

ttcctgtata tgtttgtttg cctatgcccc catgccaaga atattaagtt ttctttaaag 1800 

tctactttac ccatcggtgc tgggttgggc tcaagcgcct ctatttctgt atcactggcc 1860 

ttagctatgg cctacttggg ggggttaata ggatctaatg acttggaaaa gctgtcagaa 1920 

aacgataagc atatagtgaa tcaatgggcc ttcataggtg aaaagtgtat tcacggtacc 1980 

ccttcaggaa tagataacgc tgtggccact tatggtaatg ccctgctatt tgaaaaagac 2040

tcacataatg gaacaataaa cacaaacaat tttaagttct tagatgattt cccagccatt 2100 

ccaatgatcc taacctatac tagaattcca aggtctacaa aagatcttgt tgctcgcgtt 2160 

cgtgtgttgg tcaccgagaa atttcctgaa gttatgaagc caattctaga tgccatgggt 2220 

gaatgtgccc tacaaggctt agagatcatg actaagttaa gtaaatgtaa aggcaccgat 2280 

gacgaggctg tagaaactaa taatgaactg tatgaacaac tattggaatt gataagaata 2340 

aatcatggac tgcttgtctc aatcggtgtt tctcatcctg gattagaact tattaaaaat 2400

ctgagcgatg atttgagaat tggctccaca aaacttaccg gtgctggtgg cggcggttgc 2460

tctttgactt tgttacgaag agacattact caagagcaaa ttgacagctt caaaaagaaa 2520 

ttgcaagatg attttagtta cgagacattt gaaacagact tgggtgggac tggctgctgt 2580 

ttgttaagcg caaaaaattt gaataaagat cttaaaatca aatccctagt attccaatta 2640 

tttgaaaata aaactaccac aaagcaacaa attgacgatc tattattgcc aggaaacacg 2700 

aatttaccat ggacttcaga cgaggagttt taatgactgt atatactgct agtgtaactg 2760

ctccggtaaa tattgctact cttaagtatt gggggaaaag ggacacgaag ttgaatctgc 2820 

ccaccaattc gtccatatca gtgactttat cgcaagatga cctcagaacg ttgacctctg 2880 

cggctactgc acctgagttt gaacgcgaca ctttgtggtt aaatggagaa ccacacagca 2940 
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tcgacaatga aagaactcaa aattgtctgc gcgacctacg ccaattaaga aaggaaatgg 3000 

aatcgaagga cgcctcattg cccacattat ctcaatggaa actccacatt gtctccgaaa 3060 

ataactttcc tacagcagct ggtttagctt cctccgctgc tggctttgct gcattggtct 3120

ctgcaattgc taagttatac caattaccac agtcaacttc agaaatatct agaatagcaa 3180

gaaaggggtc tggttcagct tgtagatcgt tgtttggcgg atacgtggcc tgggaaatgg 3240

gaaaagctga agatggtcat gattccatgg cagtacaaat cgcagacagc tctgactggc 3300

ctcagatgaa agcttgtgtc ctagttgtca gcgatattaa aaaggatgtg agttccactc 3360 

agggtatgca attgaccgtg gcaacctccg aactatttaa agaaagaatt gaacatgtcg 3420 

taccaaagag atttgaagtc atgcgtaaag ccattgttga aaaagatttc gccacctttg 3480 

caaaggaaac aatgatggat tccaactctt tccatgccac atgtttggac tctttccctc 3540 

caatattcta catgaatgac acttccaagc gtatcatcag ttggtgccac accattaatc 3600 

agttttacgg agaaacaatc gttgcataca cgtttgatgc aggtccaaat gctgtgttgt 3660 

actacttagc tgaaaatgag tcgaaactct ttgcatttat ctataaattg tttggctctg 3720 

ttcctggatg ggacaagaaa tttactactg agcagcttga ggctttcaac catcaatttg 3780

aatcatctaa ctttactgca cgtgaattgg atcttgagtt gcaaaaggat gttgccagag 3840

tgattttaac tcaagtcggt tcaggcccac aagaaacaaa cgaatctttg attgacgcaa 3900 

agactggtct accaaaggaa gaggagtttt aactcgacgc cggcggaggc acatatgtct 3960 

cagaacgttt acattgtatc gactgccaga accccaattg gttcattcca gggttctcta 4020 

tcctccaaga cagcagtgga attgggtgct gttgctttaa aaggcgcctt ggctaaggtt 4080 

ccagaattgg atgcatccaa ggattttgac gaaattattt ttggtaacgt tctttctgcc 4140 

aatttgggcc aagctccggc cagacaagtt gctttggctg ccggtttgag taatcatatc 4200 

gttgcaagca cagttaacaa ggtctgtgca tccgctatga aggcaatcat tttgggtgct 4260 

caatccatca aatgtggtaa tgctgatgtt gtcgtagctg gtggttgtga atctatgact 4320 

aacgcaccat actacatgcc agcagcccgt gcgggtgcca aatttggcca aactgttctt 4380 

gttgatggtg tcgaaagaga tgggttgaac gatgcgtacg atggtctagc catgggtgta 4440 

cacgcagaaa agtgtgcccg tgattgggat attactagag aacaacaaga caattttgcc 4500

atcgaatcct accaaaaatc tcaaaaatct caaaaggaag gtaaattcga caatgaaatt 4560 

gtacctgtta ccattaaggg atttagaggt aagcctgata ctcaagtcac gaaggacgag 4620

gaacctgcta gattacacgt tgaaaaattg agatctgcaa ggactgtttt ccaaaaagaa 4680

aacggtactg ttactgccgc taacgcttct ccaatcaacg atggtgctgc agccgtcatc 4740 

ttggtttccg aaaaagtttt gaaggaaaag aatttgaagc ctttggctat tatcaaaggt 4800

tggggtgagg ccgctcatca accagctgat tttacatggg ctccatctct tgcagttcca 4860
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aaggtgaaga 


6540 


tggatctgta gccaatggac actgaggatc cgtcgagcac gtggaggcac atatgcaatg 6600

ctgtgagatg cctgttggat acattcagat tcctgttggg attgctggtc cattgttgct 6660

tgatggttat gagtactctg ttcctatggc tacaaccgaa ggttgtttgg ttgctagcac 6720

taacagaggc tgcaaggcta tgtttatctc tggtggcgcc accagtaccg ttcttaagga 6780
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cggtatgacc cgagcacctg ttgttcggtt cgcttcggcg agacgagctt cggagcttaa 6840 

gtttttcttg gagaatccag agaactttga tactttggca gtagtcttca acaggtcgag 6900 

tagatttgca agactgcaaa gtgttaaatg cacaatcgcg gggaagaatg cttatgtaag 6960 

gttctgttgt agtactggtg atgctatggg gatgaatatg gtttctaaag gtgtgcagaa 7020

tgttcttgag tatcttaccg atgatttccc tgacatggat gtgattggaa tctctggtaa 7080 

cttctgttcg gacaagaaac ctgctgctgt gaactggatt gagggacgtg gtaaatcagt 7140 

tgtttgcgag gctgtaatca gaggagagat cgtgaacaag gtcttgaaaa cgagcgtggc 7200 

tgctttagtc gagctcaaca tgctcaagaa cctagctggc tctgctgttg caggctctct 7260 

aggtggattc aacgctcatg ccagtaacat agtgtctgct gtattcatag ctactggcca 7320 

agatccagct caaaacgtgg agagttctca atgcatcacc atgatggaag ctattaatga 7380 

cggcaaagat atccatatct cagtcactat gccatctatc gaggtgggga cagtgggagg 7440 

aggaacacag cttgcatctc aatcagcgtg tttaaacctg ctcggagtta aaggagcaag 7500 

cacagagtcg ccgggaatga acgcaaggag gctagcgacg atcgtagccg gagcagtttt 7560

agctggagag ttatctttaa tgtcagcaat tgcagctgga cagcttgtga gaagtcacat 7620 

gaaatacaat agatccagcc gagacatctc tggagcaacg acaacgacaa caacaacaac 7680 

atgacccgta aggaggcaca tatgagtgag cttatacccg cctgggttgg tgacagactg 7740 

gctccggtgg acaagttgga ggtgcatttg aaagggctcc gccacaaggc ggtgtctgtt 7800 

ttcgtcatgg atggcgaaaa cgtgctgatc cagcgccgct cggaggagaa atatcactct 7860 

cccgggcttt gggcgaacac ctgctgcacc catccgggct ggaccgaacg ccccgaggaa 7920

tgcgcggtgc ggcggctgcg cgaggagctg gggatcaccg ggctttatcc cgcccatgcc 7980

gaccggctgg aatatcgcgc cgatgtcggc ggcggcatga tcgagcatga ggtggtcgac 8040

atctatctgg cctatgccaa accgcatatg cggatcaccc ccgatccgcg cgaagtggcc 8100 

gaggtgcgct ggatcggcct ttacgatctg gcggccgagg ccggtcggca tcccgagcgg 8160 

ttctcgaaat ggctcaacat ctatctgtcg agccatcttg accggatttt cggatcgatc 8220 

ctgcgcggct gagcg 8235 

<210> 61 
<211> 7681 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Operon C containing A. thaliana, S. cerevisiae, and Streptomyces
CL190 
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DNA, and R. capsulatus DNA 
<400> 61 

ggccgcgtcg actacggccg caggaggagt tcatatgtca gagttgagag ccttcagtgc 60 

cccagggaaa gcgttactag ctggtggata tttagtttta gatacaaaat atgaagcatt 120 

tgtagtcgga ttatcggcaa gaatgcatgc tgtagcccat ccttacggtt cattgcaagg 180 

gtctgataag tttgaagtgc gtgtgaaaag taaacaattt aaagatgggg agtggctgta 240 

ccatataagt cctaaaagtg gcttcattcc tgtttcgata ggcggatcta agaacccttt 300 

cattgaaaaa gttatcgcta acgtatttag ctactttaaa cctaacatgg acgactactg 360 

caatagaaac ttgttcgtta ttgatatttt ctctgatgat gcctaccatt ctcaggagga 420

tagcgttacc gaacatcgtg gcaacagaag attgagtttt cattcgcaca gaattgaaga 480

agttcccaaa acagggctgg gctcctcggc aggtttagtc acagttttaa ctacagcttt 540

ggcctccttt tttgtatcgg acctggaaaa taatgtagac aaatatagag aagttattca 600 

taatttagca caagttgctc attgtcaagc tcagggtaaa attggaagcg ggtttgatgt 660 

agcggcggca gcatatggat ctatcagata tagaagattc ccacccgcat taatctctaa 720 

tttgccagat attggaagtg ctacttacgg cagtaaactg gcgcatttgg ttgatgaaga 780 

agactggaat attacgatta aaagtaacca tttaccttcg ggattaactt tatggatggg 840 

cgatattaag aatggttcag aaacagtaaa actggtccag aaggtaaaaa attggtatga 900 

ttcgcatatg ccagaaagct tgaaaatata tacagaactc gatcatgcaa attctagatt 960 

tatggatgga ctatctaaac tagatcgctt acacgagact catgacgatt acagcgatca 1020 

gatatttgag tctcttgaga ggaatgactg tacctgtcaa aagtatcctg aaatcacaga 1080 

agttagagat gcagttgcca caattagacg ttcctttaga aaaataacta aagaatctgg 1140 

tgccgatatc gaacctcccg tacaaactag cttattggat gattgccaga ccttaaaagg 1200 

agttcttact tgcttaatac ctggtgctgg tggttatgac gccattgcag tgattactaa 1260 

gcaagatgtt gatcttaggg ctcaaaccgc taatgacaaa agattttcta aggttcaatg 1320 

gctggatgta actcaggctg actggggtgt taggaaagaa aaagatccgg aaacttatct 1380 

tgataaactg caggaggagt tttaatgtca ttaccgttct taacttctgc accgggaaag 1440 

gttattattt ttggtgaaca ctctgctgtg tacaacaagc ctgccgtcgc tgctagtgtg 1500 

tctgcgttga gaacctacct gctaataagc gagtcatctg caccagatac tattgaattg 1560 

gacttcccgg acattagctt taatcataag tggtccatca atgatttcaa tgccatcacc 1620 

gaggatcaag taaactccca aaaattggcc aaggctcaac aagccaccga tggcttgtct 1680 

caggaactcg ttagtctttt ggatccgttg ttagctcaac tatccgaatc cttccactac 1740 

catgcagcgt tttgtttcct gtatatgttt gtttgcctat gcccccatgc caagaatatt 1800 
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aagttttctt taaagtctac tttacccatc ggtgctgggt tgggctcaag cgcctctatt 18 60 

tctgtatcac tggccttagc tatggcctac ttgggggggt taataggatc taatgacttg 1920 

gaaaagctgt cagaaaacga taagcatata gtgaatcaat gggccttcat aggtgaaaag 1980 

tgtattcacg gtaccccttc aggaatagat aacgctgtgg ccacttatgg taatgccctg 2040 

ctatttgaaa aagactcaca taatggaaca ataaacacaa acaattttaa gttcttagat 2100 

gatttcccag ccattccaat gatcctaacc tatactagaa ttccaaggtc tacaaaagat 2160 

cttgttgctc gcgttcgtgt gttggtcacc gagaaatttc ctgaagttat gaagccaatt 2220 

ctagatgcca tgggtgaatg tgccctacaa ggcttagaga tcatgactaa gttaagtaaa 2280 

tgtaaaggca ccgatgacga ggctgtagaa actaataatg aactgtatga acaactattg 2340 

gaattgataa gaataaatca tggactgctt gtctcaatcg gtgtttctca tcctggatta 2400

gaacttatta aaaatctgag cgatgatttg agaattggct ccacaaaact taccggtgct 2460

ggtggcggcg gttgctcttt gactttgtta cgaagagaca ttactcaaga gcaaattgac 2520 

agcttcaaaa agaaattgca agatgatttt agttacgaga catttgaaac agacttgggt 2580 

gggactggct gctgtttgtt aagcgcaaaa aatttgaata aagatcttaa aatcaaatcc 2640 

ctagtattcc aattatttga aaataaaact accacaaagc aacaaattga cgatctatta 2700 

ttgccaggaa acacgaattt accatggact tcagacgagg agttttaatg actgtatata 2760 

ctgctagtgt aactgctccg gtaaatattg ctactcttaa gtattggggg aaaagggaca 2820 

cgaagttgaa tctgcccacc aattcgtcca tatcagtgac tttatcgcaa gatgacctca 2880 

gaacgttgac ctctgcggct actgcacctg agtttgaacg cgacactttg tggttaaatg 2940

gagaaccaca cagcatcgac aatgaaagaa ctcaaaattg tctgcgcgac ctacgccaat 3000 

taagaaagga aatggaatcg aaggacgcct cattgcccac attatctcaa tggaaactcc 3060 

acattgtctc cgaaaataac tttcctacag cagctggttt agcttcctcc gctgctggct 3120 

ttgctgcatt ggtctctgca attgctaagt tataccaatt accacagtca acttcagaaa 3180 

tatctagaat agcaagaaag gggtctggtt cagcttgtag atcgttgttt ggcggatacg 3240 

tggcctggga aatgggaaaa gctgaagatg gtcatgattc catggcagta caaatcgcag 3300 

acagctctga ctggcctcag atgaaagctt gtgtcctagt tgtcagcgat attaaaaagg 3360 

atgtgagttc cactcagggt atgcaattga ccgtggcaac ctccgaacta tttaaagaaa 3420 

gaattgaaca tgtcgtacca aagagatttg aagtcatgcg taaagccatt gttgaaaaag 3480

atttcgccac ctttgcaaag gaaacaatga tggattccaa ctctttccat gccacatgtt 3540 

tggactcttt ccctccaata ttctacatga atgacacttc caagcgtatc atcagttggt 3600 

gccacaccat taatcagttt tacggagaaa caatcgttgc atacacgttt gatgcaggtc 3660 

caaatgctgt gttgtactac ttagctgaaa atgagtcgaa actctttgca tttatctata 3720 
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aattotttoo ctctattcct aaataaaaca aaaaatttac 


faff rr a rrpan 
LaLLyayLdy 


P , "f"4-rrarT<-«r/-'4-4- 

l» l Ly dy y L,LT, 


J'oU 


tcaaccatca atttaaatra tctaacttta rt arscrii' na 


3 t~ t" Pf Pt.3 "t" c f" f* 
ex. u LyyaLLLL 


pra pr4" f- rtpa o r» 
ydy LLyLddd 


m a n 


a crcrat crt toe caoaotoatt ttaactraaa t poatt rapin 


ppparaa praa 
LLLctLaayad 


apaaappaaf 
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O Qpi P» 
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y LLuLdaCLC 
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aaaggegect 
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4 08 0 


t CTQCt" 3 3 PlPft t nra pra a f" t" pt pia ■f~prp , a +■ ppa a rrpra - f'+*4-4-/-Y-3 
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LyddaLLdLL 


tttggtaacg 


414 0 


ttctttctoc pa at t t* ctcinc paanrt 1 pppjn pparrar , aaaf 
*- *■» v- u w y u» uaa u u uyy yu uoayu^uuyy u.u>ay dL»ddy L 
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Lgct l uyy cl 


gccggcLiiga 


A O P* Pi 

4^00 


Gtaatcatat pprttripaanp arafit^aac a apTPTi~p , 4"-rT+- rrr~» 


afppnpt" -5 4- /-* 
dLCCyCLdLy 


aaggcaa tea 


4^ bO 


tttt Oaot ac t paat 1 prat" p aaa"t"pff"prprra a+ , rt , r , t*rra , f"rr+* 


ty LCy LayCL 


j^r 4» if ^<r4* 4*" #i^r 4* r** 


4 JzU 


aatctatcrac taacacacpa taptapatprp r , apir , arrr'r"T > rT 


LyL.yyyt.ycc 


a a — i 4« 4- 4- /-r /— r y— * 

add l l tggee 


4 JoO 


aaa Ct ot t pt t Pitt era trrnt n^fnaaarrap a +~ pf rrrrf" +- rr^ ^ 
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Cy d Ly Cy LdC 


/-v — * 4- /-T^r4 - ' /**»4 — pt 

ga LggLCLag 


/i / / p\ 
4 4 4 0 


CCa t Crnrr "t" nt* aparrrr'apraa aaprf*nt-prp , p»p' pl-paHrrncTa 
^^auyyy uyu aLaLyLayao ddy Ly Ly lol y Ly d L Ly y y a 


LaLLaCLayd 


gaacaacaag 


/I enp, 

4o00 


a C a 3 t*t* +*+* rr r* r , ?»"f~r , rra3i"p»r» fapraaapat" pfpaaaaaf n 


ccaaaaggaa 


ggtaaattcg 


4560 


a. L-d d l y ci cia l uguaccLyLii accatuaagg cjauttiacjagg 


taagectgat 


actcaagtca 


4620 


cy ciciy y dLya ygaaccLycL ayaccacacy LLgaaaaauti 


gagatctgea 


aggactgttt 


4680 
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t ccaatcaac 


gatggtgctg 


474 0 


P* ?5 Pf P* O 1 / 1 t" oaf" 4— +- /-t/-t4- 4~ 4~ r~* r*r ^ -r\ "\ ^ *-c4- 4-4— 4- — * — n _ _ — ^_ 

LdyLLyucdL cuuggLLT.cc gaaaaagLLL ugaaggaaaa 


gaatttgaag 


ccttxggcta 


4800 


j"T"aj"paaanp 4-4- i~r(^r^TPr"{ _ /-T-3 /-r ripprtp^nafp a 3 r»n ^ r*» 4- <-«t — » 
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yLddyddyyd 
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y y Lddydtcy 
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Otottaccoc catttotaat ootaataota CTtap , ttcr*tr* 

y o ^oyu v-. ci v_ i_ v_ y \- a. a u yyuyyuyyuy y uyu-u LuL> LL 


t attpft paff 

L d L L y L Ld L L 
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tggaagcaca 


tgatggagca 
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O^O 0 


acft a a 3 PfPTPT^ aa+" r*t~ at* "pPTPT3P , +""hrTPrp< paarraf f/^f f 
ay uoaayyya aauau>au>Lau LyyduL Lyyu tddydLLUtt 


4- a -,4- 4-4-4-4- ^ 

LayCL L l LLy 


cactgagctt 


R*3 / Pt 

Do4 0 


gaagatgtta tctctatgag tttcaatgcg gtgacatcac tttttgagaa gtataagatt 5400

gaccctaacc aaatcgggcg tcttgaagta ggaagtgaga ctgttattga caaaagcaag 5460

tccatcaaga ccttcttgat gcagctcttt gagaaatgtg gaaacactga tgtcgaaggt 5520

gttgactcga ccaatgcttg ctatggtgga actgcagctt tgttaaactg tgtcaattgg 5580

gttgagagta actcttggga tggacgttat ggcctcgtca tttgtactga cagcgcggtt 5640






WO 02/10398 



PCT/US01/24037 




tatgcagaag gacccgcaag gcccactgga ggagctgcag cgattgctat gttgatagga 5700 

cctgatgctc ctatcgtttt cgaaagcaaa ttgagagcaa gccacatggc tcatgtctat 5760

gacttttaca agcccaatct tgctagcgag tacccggttg ttgatggtaa gctttcacag 5820 

acttgctacc tcatggctct tgactcctgc tataaacatt tatgcaacaa gttcgagaag 5880 

atcgagggca aagagttctc cataaatgat gctgattaca ttgttttcca ttctccatac 5940 

aataaacttg tacagaaaag ctttgctcgt ctcttgtaca acgacttctt gagaaacgca 6000 

agctccattg acgaggctgc caaagaaaag ttcacccctt attcatcttt gacccttgac 6060 

gagagttacc aaagccgtga tcttgaaaag gtgtcacaac aaatttcgaa accgttttat 6120 

gatgctaaag tgcaaccaac gactttaata ccaaaggaag tcggtaacat gtacactgct 6180 

tctctctacg ctgcatttgc ttccctcatc cacaataaac acaatgattt ggcgggaaag 6240

cgggtggtta tgttctctta tggaagtggc tccaccgcaa caatgttctc attacgcctc 6300 

aacgacaata agcctccttt cagcatttca aacattgcat ctgtaatgga tgttggcggt 6360 

aaattgaaag ctagacatga gtatgcacct gagaagtttg tggagacaat gaagctaatg 6420 

gaacataggt atggagcaaa ggactttgtg acaaccaagg agggtattat agatcttttg 6480

gcaccgggaa cttattatct gaaagaggtt gattccttgt accggagatt ctatggcaag 6540

aaaggtgaag atggatctgt agccaatgga cactgaggat ccgtcgactc gagcacgtga 6600 

ggaggcacat atgacggaaa cgcacgccat agccggggtc ccgatgaggt gggtgggacc 6660 

ccttcgtatt tccgggaacg tcgccgagac cgagacccag gtcccgctcg ccacgtacga 6720 

gtcgccgctg tggccgtcgg tgggccgcgg ggcgaaggtc tcccggctga cggagaaggg 6780

catcgtcgcc accctcgtcg acgagcggat gacccgctcg gtgatcgtcg aggcgacgga 6840

cgcgcagacc gcgtacatgg ccgcgcagac catccacgcc cgcatcgacg agctgcgcga 6900

ggtggtgcgc ggctgcagcc ggttcgccca gctgatcaac atcaagcacg agatcaacgc 6960 

gaacctgctg ttcatccggt tcgagttcac caccggtgac gcctccggcc acaacatggc 7020 

cacgctcgcc tccgatgtgc tcctggggca cctgctggag acgatccctg gcatctccta 7080 

cggctcgatc tccggcaact actgcacgga caagaaggcc accgcgatca acggcatcct 7140 

cggccgcggc aagaacgtga tcaccgagct gctggtgccg cgggacgtcg tcgagaacaa 7200 

cctgcacacc acggctgcca agatcgtcga gctgaacatc cgcaagaacc tgctcggcac 7260 

cctgctcgcc ggcggcatcc gctcggccaa cgcccacttc gcgaacatgc tgctcggctt 7320 

ctacctggcc accggccagg acgccgccaa catcgtcgag ggctcgcagg gcgtcgtcat 7380 

ggccgaggac cgcgacggcg acctctactt cgcctgcacc ctgccgaacc tgatcgtcgg 7440 

cacggtcggc aacggcaagg gtctcggctt cgtggagacg aacctcgccc ggctcggctg 7500 

ccgagccgac cgcgaacccg gggagaacgc ccgccgcctc gccgtcatcg cggcagcgac 7560 
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cgtgctgtgc ggtgaactct cgctgctcgc ggcacagacg aacccgggcg aactcatgcg 7620 

cgcgcacgtc cagctggaac gcgacaacaa gaccgcaaag gttggtgcat agacgcgtgc 7680 

g 7681

<210> 62 

<211> 8224 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Operon E containing A. thaliana, S. cerevisiae, Streptomyces sp CL190, and R. capsulatus DNA



<400> 62 
ggccgcgtcg actacggccg caggaggagt tcatatgtca gagttgagag ccttcagtgc 60
cccagggaaa gcgttactag ctggtggata tttagtttta gatacaaaat atgaagcatt 120
tgtagtcgga ttatcggcaa gaatgcatgc tgtagcccat ccttacggtt cattgcaagg 180
gtctgataag tttgaagtgc gtgtgaaaag taaacaattt aaagatgggg agtggctgta 240
ccatataagt cctaaaagtg gcttcattcc tgtttcgata ggcggatcta agaacccttt 300
cattgaaaaa gttatcgcta acgtatttag ctactttaaa cctaacatgg acgactactg 360
caatagaaac ttgttcgtta ttgatatttt ctctgatgat gcctaccatt ctcaggagga 420
tagcgttacc gaacatcgtg gcaacagaag attgagtttt cattcgcaca gaattgaaga 480
agttcccaaa acagggctgg gctcctcggc aggtttagtc acagttttaa ctacagcttt 540
ggcctccttt tttgtatcgg acctggaaaa taatgtagac aaatatagag aagttattca 600
taatttagca caagttgctc attgtcaagc tcagggtaaa attggaagcg ggtttgatgt 660
agcggcggca gcatatggat ctatcagata tagaagattc ccacccgcat taatctctaa 720
tttgccagat attggaagtg ctacttacgg cagtaaactg gcgcatttgg ttgatgaaga 780
agactggaat attacgatta aaagtaacca tttaccttcg ggattaactt tatggatggg 840
cgatattaag aatggttcag aaacagtaaa actggtccag aaggtaaaaa attggtatga 900
ttcgcatatg ccagaaagct tgaaaatata tacagaactc gatcatgcaa attctagatt 960
tatggatgga ctatctaaac tagatcgctt acacgagact catgacgatt acagcgatca 1020
gatatttgag tctcttgaga ggaatgactg tacctgtcaa aagtatcctg aaatcacaga 1080
agttagagat gcagttgcca caattagacg ttcctttaga aaaataacta aagaatctgg 1140
tgccgatatc gaacctcccg tacaaactag cttattggat gattgccaga ccttaaaagg 1200
agttcttact tgcttaatac ctggtgctgg tggttatgac gccattgcag tgattactaa 1260
gcaagatgtt gatcttaggg ctcaaaccgc taatgacaaa agattttcta aggttcaatg 1320
gctggatgta actcaggctg actggggtgt taggaaagaa aaagatccgg aaacttatct 1380
tgataaactg caggaggagt tttaatgtca ttaccgttct taacttctgc accgggaaag 1440
gttattattt ttggtgaaca ctctgctgtg tacaacaagc ctgccgtcgc tgctagtgtg 1500
tctgcgttga gaacctacct gctaataagc gagtcatctg caccagatac tattgaattg 1560
gacttcccgg acattagctt taatcataag tggtccatca atgatttcaa tgccatcacc 1620
gaggatcaag taaactccca aaaattggcc aaggctcaac aagccaccga tggcttgtct 1680
caggaactcg ttagtctttt ggatccgttg ttagctcaac tatccgaatc cttccactac 1740
catgcagcgt tttgtttcct gtatatgttt gtttgcctat gcccccatgc caagaatatt 1800


aagttttctt taaagtctac tttacccatc ggtgctgggt tgggctcaag cgcctctatt 1860
tctgtatcac tggccttagc tatggcctac ttgggggggt taataggatc taatgacttg 1920
gaaaagctgt cagaaaacga taagcatata gtgaatcaat gggccttcat aggtgaaaag 1980
tgtattcacg gtaccccttc aggaatagat aacgctgtgg ccacttatgg taatgccctg 2040
ctatttgaaa aagactcaca taatggaaca ataaacacaa acaattttaa gttcttagat 2100
gatttcccag ccattccaat gatcctaacc tatactagaa ttccaaggtc tacaaaagat 2160
cttgttgctc gcgttcgtgt gttggtcacc gagaaatttc ctgaagttat gaagccaatt 2220
ctagatgcca tgggtgaatg tgccctacaa ggcttagaga tcatgactaa gttaagtaaa 2280
tgtaaaggca ccgatgacga ggctgtagaa actaataatg aactgtatga acaactattg 2340
gaattgataa gaataaatca tggactgctt gtctcaatcg gtgtttctca tcctggatta 2400
gaacttatta aaaatctgag cgatgatttg agaattggct ccacaaaact taccggtgct 2460
ggtggcggcg gttgctcttt gactttgtta cgaagagaca ttactcaaga gcaaattgac 2520
agcttcaaaa agaaattgca agatgatttt agttacgaga catttgaaac agacttgggt 2580
gggactggct gctgtttgtt aagcgcaaaa aatttgaata aagatcttaa aatcaaatcc 2640
ctagtattcc aattatttga aaataaaact accacaaagc aacaaattga cgatctatta 2700
ttgccaggaa acacgaattt accatggact tcagacgagg agttttaatg actgtatata 2760
ctgctagtgt aactgctccg gtaaatattg ctactcttaa gtattggggg aaaagggaca 2820
cgaagttgaa tctgcccacc aattcgtcca tatcagtgac tttatcgcaa gatgacctca 2880
gaacgttgac ctctgcggct actgcacctg agtttgaacg cgacactttg tggttaaatg 2940
gagaaccaca cagcatcgac aatgaaagaa ctcaaaattg tctgcgcgac ctacgccaat 3000
taagaaagga aatggaatcg aaggacgcct cattgcccac attatctcaa tggaaactcc 3060
acattgtctc cgaaaataac tttcctacag cagctggttt agcttcctcc gctgctggct 3120
ttgctgcatt ggtctctgca attgctaagt tataccaatt accacagtca acttcagaaa 3180
tatctagaat agcaagaaag gggtctggtt cagcttgtag atcgttgttt ggcggatacg 3240
tggcctggga aatgggaaaa gctgaagatg gtcatgattc catggcagta caaatcgcag 3300
acagctctga ctggcctcag atgaaagctt gtgtcctagt tgtcagcgat attaaaaagg 3360
atgtgagttc cactcagggt atgcaattga ccgtggcaac ctccgaacta tttaaagaaa 3420
gaattgaaca tgtcgtacca aagagatttg aagtcatgcg taaagccatt gttgaaaaag 3480
atttcgccac ctttgcaaag gaaacaatga tggattccaa ctctttccat gccacatgtt 3540
tggactcttt ccctccaata ttctacatga atgacacttc caagcgtatc atcagttggt 3600


gccacaccat taatcagttt tacggagaaa caatcgttgc atacacgttt gatgcaggtc 3660
caaatgctgt gttgtactac ttagctgaaa atgagtcgaa actctttgca tttatctata 3720
aattgtttgg ctctgttcct ggatgggaca agaaatttac tactgagcag cttgaggctt 3780
tcaaccatca atttgaatca tctaacttta ctgcacgtga attggatctt gagttgcaaa 3840
aggatgttgc cagagtgatt ttaactcaag tcggttcagg cccacaagaa acaaacgaat 3900
ctttgattga cgcaaagact ggtctaccaa aggaagagga gttttaactc gagtaggagg 3960
cacatatgtc tcagaacgtt tacattgtat cgactgccag aaccccaatt ggttcattcc 4020
agggttctct atcctccaag acagcagtgg aattgggtgc tgttgcttta aaaggcgcct 4080
tggctaaggt tccagaattg gatgcatcca aggattttga cgaaattatt tttggtaacg 4140
ttctttctgc caatttgggc caagctccgg ccagacaagt tgctttggct gccggtttga 4200
gtaatcatat cgttgcaagc acagttaaca aggtctgtgc atccgctatg aaggcaatca 4260
ttttgggtgc tcaatccatc aaatgtggta atgctgatgt tgtcgtagct ggtggttgtg 4320
aatctatgac taacgcacca tactacatgc cagcagcccg tgcgggtgcc aaatttggcc 4380
aaactgttct tgttgatggt gtcgaaagag atgggttgaa cgatgcgtac gatggtctag 4440
ccatgggtgt acacgcagaa aagtgtgccc gtgattggga tattactaga gaacaacaag 4500
acaattttgc catcgaatcc taccaaaaat ctcaaaaatc tcaaaaggaa ggtaaattcg 4560
acaatgaaat tgtacctgtt accattaagg gatttagagg taagcctgat actcaagtca 4620
cgaaggacga ggaacctgct agattacacg ttgaaaaatt gagatctgca aggactgttt 4680
tccaaaaaga aaacggtact gttactgccg ctaacgcttc tccaatcaac gatggtgctg 4740
cagccgtcat cttggtttcc gaaaaagttt tgaaggaaaa gaatttgaag cctttggcta 4800
ttatcaaagg ttggggtgag gccgctcatc aaccagctga ttttacatgg gctccatctc 4860
ttgcagttcc aaaggctttg aaacatgctg gcatcgaaga catcaattct gttgattact 4920
ttgaattcaa tgaagccttt tcggttgtcg gtttggtgaa cactaagatt ttgaagctag 4980
acccatctaa ggttaatgta tatggtggtg ctgttgctct aggtcaccca ttgggttgtt 5040
ctggtgctag agtggttgtt acactgctat ccatcttaca gcaagaagga ggtaagatcg 5100
gtgttgccgc catttgtaat ggtggtggtg gtgcttcctc tattgtcatt gaaaagatat 5160
gaggatcctc tagatgcgca ggaggcacat atggcgaaga acgttgggat tttggctatg 5220
gatatctatt tccctcccac ctgtgttcaa caggaagctt tggaagcaca tgatggagca 5280
agtaaaggga aatacactat tggacttggc caagattgtt tagctttttg cactgagctt 5340
gaagatgtta tctctatgag tttcaatgcg gtgacatcac tttttgagaa gtataagatt 5400


gaccctaacc aaatcgggcg tcttgaagta ggaagtgaga ctgttattga caaaagcaag 5460
tccatcaaga ccttcttgat gcagctcttt gagaaatgtg gaaacactga tgtcgaaggt 5520
gttgactcga ccaatgcttg ctatggtgga actgcagctt tgttaaactg tgtcaattgg 5580
gttgagagta actcttggga tggacgttat ggcctcgtca tttgtactga cagcgcggtt 5640
tatgcagaag gacccgcaag gcccactgga ggagctgcag cgattgctat gttgatagga 5700
cctgatgctc ctatcgtttt cgaaagcaaa ttgagagcaa gccacatggc tcatgtctat 5760
gacttttaca agcccaatct tgctagcgag tacccggttg ttgatggtaa gctttcacag 5820
acttgctacc tcatggctct tgactcctgc tataaacatt tatgcaacaa gttcgagaag 5880
atcgagggca aagagttctc cataaatgat gctgattaca ttgttttcca ttctccatac 5940
aataaacttg tacagaaaag ctttgctcgt ctcttgtaca acgacttctt gagaaacgca 6000
agctccattg acgaggctgc caaagaaaag ttcacccctt attcatcttt gacccttgac 6060
gagagttacc aaagccgtga tcttgaaaag gtgtcacaac aaatttcgaa accgttttat 6120
gatgctaaag tgcaaccaac gactttaata ccaaaggaag tcggtaacat gtacactgct 6180
tctctctacg ctgcatttgc ttccctcatc cacaataaac acaatgattt ggcgggaaag 6240
cgggtggtta tgttctctta tggaagtggc tccaccgcaa caatgttctc attacgcctc 6300
aacgacaata agcctccttt cagcatttca aacattgcat ctgtaatgga tgttggcggt 6360
aaattgaaag ctagacatga gtatgcacct gagaagtttg tggagacaat gaagctaatg 6420
gaacataggt atggagcaaa ggactttgtg acaaccaagg agggtattat agatcttttg 6480
gcaccgggaa cttattatct gaaagaggtt gattccttgt accggagatt ctatggcaag 6540
aaaggtgaag atggatctgt agccaatgga cactgaggat ccgtcgactc gagcacgtga 6600
ggaggcacat atgacggaaa cgcacgccat agccggggtc ccgatgaggt gggtgggacc 6660
ccttcgtatt tccgggaacg tcgccgagac cgagacccag gtcccgctcg ccacgtacga 6720
gtcgccgctg tggccgtcgg tgggccgcgg ggcgaaggtc tcccggctga cggagaaggg 6780
catcgtcgcc accctcgtcg acgagcggat gacccgctcg gtgatcgtcg aggcgacgga 6840
cgcgcagacc gcgtacatgg ccgcgcagac catccacgcc cgcatcgacg agctgcgcga 6900


ggtggtgcgc ggctgcagcc ggttcgccca gctgatcaac atcaagcacg agatcaacgc 6960
gaacctgctg ttcatccggt tcgagttcac caccggtgac gcctccggcc acaacatggc 7020
cacgctcgcc tccgatgtgc tcctggggca cctgctggag acgatccctg gcatctccta 7080
cggctcgatc tccggcaact actgcacgga caagaaggcc accgcgatca acggcatcct 7140
cggccgcggc aagaacgtga tcaccgagct gctggtgccg cgggacgtcg tcgagaacaa 7200
cctgcacacc acggctgcca agatcgtcga gctgaacatc cgcaagaacc tgctcggcac 7260
cctgctcgcc ggcggcatcc gctcggccaa cgcccacttc gcgaacatgc tgctcggctt 7320
ctacctggcc accggccagg acgccgccaa catcgtcgag ggctcgcagg gcgtcgtcat 7380
ggccgaggac cgcgacggcg acctctactt cgcctgcacc ctgccgaacc tgatcgtcgg 7440
cacggtcggc aacggcaagg gtctcggctt cgtggagacg aacctcgccc ggctcggctg 7500
ccgagccgac cgcgaacccg gggagaacgc ccgccgcctc gccgtcatcg cggcagcgac 7560
cgtgctgtgc ggtgaactct cgctgctcgc ggcacagacg aacccgggcg aactcatgcg 7620




cgcgcacgtc 


cagctggaac 


gcgacaacaa 


cmccciCr\rlr\Ci rrt t riPrt rrr*^ 1~ arrar^rrnrrrfl-a 


/ DO U 


aqqaqqcaca 


tatgagtgag 


ct tat accca 


c ct PffTPri" +" rrrr frrapapaftrr rrnf nrrrrrf rrrr 
y y u L.yy uy ctL-ciy en- yL-LUCyyugg 




acaagttgga 


ggtgcatttg 


aaaaaact cc 

w4 U U. VI M l» V«r 


yuua^aayy^ yy iy LUiyLi l LL»y LL»d Ly y 


7 onn 


atggcgaaaa 


cgt get gate 


cacf core ccrct 


^yy a yy "3°° dUcn.t-ciL.n_iL. cccyyycLLL 


/oDU 


gggcgaacac 


ct act acacc 


faf r , r , rrrrrrr*t 


yy ctLLyddcg ccccgaggaa Lgcgcggcgc 




qqcqqctqca 


cgaggagct g 


aaaat caeca 


Pf Pf P* 1 1" 't' 3 4" O O pfTpp f^s, 4~ fro 0 rtapprrrrpf , rtrr 
yyuuLuaLu^ i_y LLLa Ly L.L. y aLLy y l. Lyy 




aatatcgcgc cgatgtcggc ggcggcatga tcgagcatga ggtggtcgac atctatctgg 8040
cctatgccaa accgcatatg cggatcaccc ccgatccgcg cgaagtggcc gaggtgcgct 8100
ggatcggcct ttacgatctg gcggccgagg ccggtcggca tcccgagcgg ttctcgaaat 8160
ggctcaacat ctatctgtcg agccatcttg accggatttt cggatcgatc ctgcgcggct 8220
gagc 8224



<210> 63 
<211> 8077 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Operon F containing A. thaliana, S. cerevisiae, and Streptomyces sp CL190 DNA
<400> 63 

ccaccgcggc ggccgcgtcg acgccggcgg aggcacatat gtctcagaac gtttacattg 60
tatcgactgc cagaacccca attggttcat tccagggttc tctatcctcc aagacagcag 120
tggaattggg tgctgttgct ttaaaaggcg ccttggctaa ggttccagaa ttggatgcat 180
ccaaggattt tgacgaaatt atttttggta acgttctttc tgccaatttg ggccaagctc 240
cggccagaca agttgctttg gctgccggtt tgagtaatca tatcgttgca agcacagtta 300
acaaggtctg tgcatccgct atgaaggcaa tcattttggg tgctcaatcc atcaaatgtg 360
gtaatgctga tgttgtcgta gctggtggtt gtgaatctat gactaacgca ccatactaca 420
tgccagcagc ccgtgcgggt gccaaatttg gccaaactgt tcttgttgat ggtgtcgaaa 480
gagatgggtt gaacgatgcg tacgatggtc tagccatggg tgtacacgca gaaaagtgtg 540
cccgtgattg ggatattact agagaacaac aagacaattt tgccatcgaa tcctaccaaa 600
aatctcaaaa atctcaaaag gaaggtaaat tcgacaatga aattgtacct gttaccatta 660
agggatttag aggtaagcct gatactcaag tcacgaagga cgaggaacct gctagattac 720
acgttgaaaa attgagatct gcaaggactg ttttccaaaa agaaaacggt actgttactg 780
ccgctaacgc ttctccaatc aacgatggtg ctgcagccgt catcttggtt tccgaaaaag 840
ttttgaagga aaagaatttg aagcctttgg ctattatcaa aggttggggt gaggccgctc 900
atcaaccagc tgattttaca tgggctccat ctcttgcagt tccaaaggct ttgaaacatg 960
ctggcatcga agacatcaat tctgttgatt actttgaatt caatgaagcc ttttcggttg 1020
tcggtttggt gaacactaag attttgaagc tagacccatc taaggttaat gtatatggtg 1080


gtgctgttgc tctaggtcac ccattgggtt gttctggtgc tagagtggtt gttacactgc 1140
tatccatctt acagcaagaa ggaggtaaga tcggtgttgc cgccatttgt aatggtggtg 1200
gtggtgcttc ctctattgtc attgaaaaga tatgaggatc ctctaggtac ttccctggcg 1260
tgtgcagcgg ttgacgcgcc gtgccctcgc tgcgagcggc gcgcacatct gacgtcctgc 1320
tttattgctt tctcagaact cgggacgaag cgatcccatg atcacgcgat ctccatgcag 1380
aaaagacaaa gggagctgag tgcgttgaca ctaccgacct cggctgaggg ggtatcagaa 1440
agccaccggg cccgctcggt cggcatcggt cgcgcccacg ccaaggccat cctgctggga 1500
gagcatgcgg tcgtctacgg agcgccggca ctcgctctgc cgattccgca gctcacggtc 1560
acggccagcg tcggctggtc gtccgaggcc tccgacagtg cgggtggcct gtcctacacg 1620
atgaccggta cgccgtcgcg ggcactggtg acgcaggcct ccgacggcct gcaccggctc 1680
accgcggaat tcatggcgcg gatgggcgtg acgaacgcgc cgcacctcga cgtgatcctg 1740
gacggcgcga tcccgcacgg ccggggtctc ggctccagcg cggccggctc acgcgcgatc 1800
gccttggccc tcgccgacct cttcggccac gaactggccg agcacacggc gtacgaactg 1860
gtgcagacgg ccgagaacat ggcgcacggc cgggccagcg gcgtggacgc gatgacggtc 1920
ggcgcgtccc ggccgctgct gttccagcag ggccgcaccg agcgactggc catcggctgc 1980
gacagcctgt tcatcgtcgc cgacagcggc gtcccgggca gcaccaagga agcggtcgag 2040
atgctgcggg agggattcac ccgcagcgcc ggaacacagg agcggttcgt cggccgggcg 2100
acggaactga ccgaggccgc ccggcaggcc ctcgccgacg gccggcccga ggagctgggc 2160
tcgcagctga cgtactacca cgagctgctc catgaggccc gcctgagcac cgacggcatc 2220
gatgcgctgg tcgaggccgc gctgaaggca ggcagcctcg gagccaagat caccggcggt 2280
ggtctgggcg gctgcatgat cgcacaggcc cggcccgaac aggcccggga ggtcacccgg 2340
cagctccacg aggccggtgc cgtacagacc tgggtcgtac cgctgaaagg gctcgacaac 2400
catgcgcagt gaacacccga ccacgaccgt gctccagtcg cgggagcagg gcagcgcggc 2460
cggcgccacc gcggtcgcgc acccaaacat cgcgctgatc aagtactggg gcaagcgcga 2520
cgagcggctg atcctgccct gcaccaccag cctgtcgatg acgctggacg tcttccccac 2580
gaccaccgag gtccggctcg accccgccgc cgagcacgac acggccgccc tcaacggcga 2640
ggtggccacg ggcgagacgc tgcgccgcat cagcgccttc ctctccctgg tgcgggaggt 2700
ggcgggcagc gaccagcggg ccgtggtgga cacccgcaac accgtgccca ccggggcggg 2760
cctggcgtcc tccgccagcg ggttcgccgc cctcgccgtc gcggccgcgg ccgcctacgg 2820
gctcgaactc gacgaccgcg ggctgtcccg gctggcccga cgtggatccg gctccgcctc 2880
gcggtcgatc ttcggcggct tcgccgtctg gcacgccggc cccgacggca cggccacgga 2940
agcggacctc ggctcctacg ccgagccggt gcccgcggcc gacctcgacc cggcgctggt 3000


catcgccgtg gtcaacgccg gccccaagcc cgtctccagc cgcgaggcca tgcgccgcac 3060
cgtcgacacc tcgccgctgt accggccgtg ggccgactcc agtaaggacg acctggacga 3120
gatgcgctcg gcgctgctgc gcggcgacct cgaggccgtg ggcgagatcg cggagcgcaa 3180
cgcgctcggc atgcacgcca ccatgctggc cgcccgcccc gcggtgcggt acctgtcgcc 3240
ggccacggtc accgtgctcg acagcgtgct ccagctccgc aaggacggtg tcctggccta 3300
cgcgaccatg gacgccggtc ccaacgtgaa ggtgctgtgc cggcgggcgg acgccgagcg 3360
ggtggccgac gtcgtacgcg ccgccgcgtc cggcggtcag gtcctcgtcg ccgggccggg 3420
agacggtgcc cgcctgctga gcgagggcgc atgacgacag gtcagcgcac gatcgtccgg 3480
cacgcgccgg gcaagctgtt cgtcgcgggc gagtacgcgg tcgtggatcc gggcaacccg 3540
gcgatcctgg tagcggtcga ccggcacatc agcgtcaccg tgtccgacgc cgacgcggac 3600
accggggccg ccgacgtcgt gatctcctcc gacctcggtc cgcaggcggt cggctggcgc 3660
tggcacgacg gccggctcgt cgtccgcgac ccggacgacg ggcagcaggc gcgcagcgcc 3720
ctggcccacg tggtgtcggc gatcgagacc gtgggccggc tgctgggcga acgcggacag 3780
aaggtccccg ctctcaccct ctccgtcagc agccgcctgc acgaggacgg ccggaagttc 3840
ggcctgggct ccagcggcgc ggtgaccgtg gcgaccgtag ccgccgtcgc cgcgttctgc 3900
ggactcgaac tgtccaccga cgaacggttc cggctggcca tgctcgccac cgcggaactc 3960
gaccccaagg gctccggcgg ggacctcgcc gccagcacct ggggcggctg gatcgcctac 4020
caggcgcccg accgggcctt tgtgctcgac ctggcccggc gcgtgggagt cgaccggaca 4080
ctgaaggcgc cctggccggg gcactcggtg cgccgactgc cggcgcccaa gggcctcacc 4140
ctggaggtcg gctggaccgg agagcccgcc tccaccgcgt ccctggtgtc cgatctgcac 4200
cgccgcacct ggcggggcag cgcctcccac cagaggttcg tcgagaccac gaccgactgt 4260
gtccgctccg cggtcaccgc cctggagtcc ggcgacgaca cgagcctgct gcacgagatc 4320
cgccgggccc gccaggagct ggcccgcctg gacgacgagg tcggcctcgg catcttcaca 4380
cccaagctga cggcgctgtg cgacgccgcc gaagccgtcg gcggcgcggc caagccctcc 4440
ggggcaggcg gcggcgactg cggcatcgcc ctgctggacg ccgaggcgtc gcgggacatc 4500
acacatgtac ggcaacggtg ggagacagcc ggggtgctgc ccctgcccct gactcctgcc 4560
ctggaaggga tctaagaatg accagcgccc aacgcaagga cgaccacgta cggctcgcca 4620
tcgagcagca caacgcccac agcggacgca accagttcga cgacgtgtcg ttcgtccacc 4680
acgccctggc cggcatcgac cggccggacg tgtccctggc cacgtccttc gccgggatct 4740
cctggcaggt gccgatctac atcaacgcga tgaccggcgg cagcgagaag accggcctca 4800
tcaaccggga cctggccacc gccgcccgcg agaccggcgt ccccatcgcg tccgggtcca 4860
tgaacgcgta catcaaggac ccctcctgcg ccgacacgtt ccgtgtgctg cgcgacgaga 4920
accccaacgg gttcgtcatc gcgaacatca acgccaccac gacggtcgac aacgcgcagc 4980


gcgcgatcga cctgatcgag gcgaacgccc tgcagatcca catcaacacg gcgcaggaga 5040
cgccgatgcc ggagggcgac cggtcgttcg cgtcctgggt cccgcagatc gagaagatcg 5100
cggcggccgt cgacatcccc gtgatcgtca aggaggtcgg caacggcctg agccggcaga 5160
ccatcctgct gctcgccgac ctcggcgtgc aggcggcgga cgtcagcggc cgcggcggca 5220
cggacttcgc ccgcatcgag aacggccgcc gggagctcgg cgactacgcg ttcctgcacg 5280
gctgggggca gtccaccgcc gcctgcctgc tggacgccca ggacatctcc ctgcccgtcc 5340
tcgcctccgg cggtgtgcgt cacccgctcg acgtggtccg cgccctcgcg ctcggcgccc 5400
gcgccgtcgg ctcctccgcc ggcttcctgc gcaccctgat ggacgacggc gtcgacgcgc 5460
tgatcacgaa gctcacgacc tggctggacc agctggcggc gctgcagacc atgctcggcg 5520
cgcgcacccc ggccgacctc acccgctgcg acgtgctgct ccacggcgag ctgcgtgact 5580
tctgcgccga ccggggcatc gacacgcgcc gcctcgccca gcgctccagc tccatcgagg 5640
ccctccagac gacgggaagc acacgatgac ggaaacgcac gccatagccg gggtcccgat 5700
gaggtgggtg ggaccccttc gtatttccgg gaacgtcgcc gagaccgaga cccaggtccc 5760
gctcgccacg tacgagtcgc cgctgtggcc gtcggtgggc cgcggggcga aggtctcccg 5820
gctgacggag aagggcatcg tcgccaccct cgtcgacgag cggatgaccc gctcggtgat 5880
cgtcgaggcg acggacgcgc agaccgcgta catggccgcg cagaccatcc acgcccgcat 5940
cgacgagctg cgcgaggtgg tgcgcggctg cagccggttc gcccagctga tcaacatcaa 6000
gcacgagatc aacgcgaacc tgctgttcat ccggttcgag ttcaccaccg gtgacgcctc 6060
cggccacaac atggccacgc tcgcctccga tgtgctcctg gggcacctgc tggagacgat 6120
ccctggcatc tcctacggct cgatctccgg caactactgc acggacaaga aggccaccgc 6180
gatcaacggc atcctcggcc gcggcaagaa cgtgatcacc gagctgctgg tgccgcggga 6240
cgtcgtcgag aacaacctgc acaccacggc tgccaagatc gtcgagctga acatccgcaa 6300
gaacctgctc ggcaccctgc tcgccggcgg catccgctcg gccaacgccc acttcgcgaa 6360
catgctgctc ggcttctacc tggccaccgg ccaggacgcc gccaacatcg tcgagggctc 6420
gcagggcgtc gtcatggccg aggaccgcga cggcgacctc tacttcgcct gcaccctgcc 6480
gaacctgatc gtcggcacgg tcggcaacgg caagggtctc ggcttcgtgg agacgaacct 6540
cgcccggctc ggctgccgag ccgaccgcga acccggggag aacgcccgcc gcctcgccgt 6600
catcgcggca gcgaccgtgc tgtgcggtga actctcgctg ctcgcggcac agacgaaccc 6660
gggcgaactc atgcgcgcgc acgtccagct ggaacgcgac aacaagaccg caaaggttgg 6720
tgcatagggc atgtccatct ccataggcat tcacgacctg tcgttcgcca caaccgagtt 6780
cgtcctgccg cacacggcgc tcgccgagta caacggcacc gagatcggca agtaccacgt 6840
cggcatcggc cagcagtcga tgagcgtgcc ggccgccgac gaggacatcg tgaccatggc 6900


cgcgaccgcg gcgcggccca tcatcgagcg caacggcaag agccggatcc gcacggtcgt 6960
gttcgccacg gagtcgtcga tcgaccaggc gaaggcgggc ggcgtgtacg tgcactccct 7020
gctggggctg gagtcggcct gccgggtcgt cgagctgaag caggcctgct acggggccac 7080
cgccgccctt cagttcgcca tcggcctggt gcggcgcgac cccgcccagc aggtcctggt 7140
catcgccagt gacgtctcca agtacgagct ggacagcccc ggcgaggcga cccagggcgc 7200
ggccgcggtg gccatgctgg tcggcgccga cccggccctg ctgcgtatcg aggagccgtc 7260
gggcctgttc accgccgacg tcatggactt ctggcggccc aactacctca ccaccgctct 7320
ggtcgacggc caggagtcca tcaacgccta cctgcaggcc gtcgagggcg cctggaagga 7380
ctacgcggag caggacggcc ggtcgctgga ggagttcgcg gcgttcgtct accaccagcc 7440
gttcacgaag atggcctaca aggcgcaccg ccacctgctg aacttcaacg gctacgacac 7500
cgacaaggac gccatcgagg gcgccctcgg ccagacgacg gcgtacaaca acgtcatcgg 7560
caacagctac accgcgtcgg tgtacctggg cctggccgcc ctgctcgacc aggcggacga 7620
cctgacgggc cgttccatcg gcttcctgag ctacggctcg ggcagcgtcg ccgagttctt 7680
ctcgggcacc gtcgtcgccg ggtaccgcga gcgtctgcgc accgaggcga accaggaggc 7740
gatcgcccgg cgcaagagcg tcgactacgc cacctaccgc gagctgcacg agtacacgct 7800
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cccgtccgac ggcggcgacc acgccacccc ggtgcagacc accggcccct tccggctggc 7860 

cgggatcaac gaccacaagc gcatctacga ggcgcgctag cgacacccct cggcaacggg 7920 

gtgcgccact gttcggcgca ccccgtgccg ggctttcgca cagctattca cgaccatttg 7980 

aggggcgggc agccgcatga ccgacgtccg attccgcatt atcggtacgg gtgcctacct 8040

agaactagtg gatcccccgg gctgcaggaa ttcgata 8077 

<210> 64 
<211> 8400 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Operon G containing A. thaliana, S. cerevisiae, and S. pombe DNA 



<400> 64 
aaccacaaaa ggagttcata tgtcagagtt gagagccttc agtgccccag ggaaagcgtt 60
actagctggt ggatatttag ttttagatac aaaatatgaa gcatttgtag tcggattatc 120
ggcaagaatg catgctgtag cccatcctta cggttcattg caagggtctg ataagtttga 180


agtgcgtgtg aaaagtaaac aatttaaaga tggggagtgg ctgtaccata taagtcctaa 240


aagtggcttc attcctgttt cgataggcgg atctaagaac cctttcattg aaaaagttat 300
cgctaacgta tttagctact ttaaacctaa catggacgac tactgcaata gaaacttgtt 360
cgttattgat attttctctg atgatgccta ccattctcag gaggatagcg ttaccgaaca 420
tcgtggcaac agaagattga gttttcattc gcacagaatt gaagaagttc ccaaaacagg 480
gctgggctcc tcggcaggtt tagtcacagt tttaactaca gctttggcct ccttttttgt 540
atcggacctg gaaaataatg tagacaaata tagagaagtt attcataatt tagcacaagt 600
tgctcattgt caagctcagg gtaaaattgg aagcgggttt gatgtagcgg cggcagcata 660
tggatctatc agatatagaa gattcccacc cgcattaatc tctaatttgc cagatattgg 720
aagtgctact tacggcagta aactggcgca tttggttgat gaagaagact ggaatattac 780
gattaaaagt aaccatttac cttcgggatt aactttatgg atgggcgata ttaagaatgg 840
ttcagaaaca gtaaaactgg tccagaaggt aaaaaattgg tatgattcgc atatgccaga 900
aagcttgaaa atatatacag aactcgatca tgcaaattct agatttatgg atggactatc 960
taaactagat cgcttacacg agactcatga cgattacagc gatcagatat ttgagtctct 1020
tgagaggaat gactgtacct gtcaaaagta tcctgaaatc acagaagtta gagatgcagt 1080
tgccacaatt agacgttcct ttagaaaaat aactaaagaa tctggtgccg atatcgaacc 1140
tcccgtacaa actagcttat tggatgattg ccagacctta aaaggagttc ttacttgctt 1200






aatacctggt gctggtggtt atgacgccat tgcagtgatt actaagcaag atgttgatct 1260
tagggctcaa accgctaatg acaaaagatt ttctaaggtt caatggctgg atgtaactca 1320
ggctgactgg ggtgttagga aagaaaaaga tccggaaact tatcttgata aactgcagga 1380
ggagttttaa tgtcattacc gttcttaact tctgcaccgg gaaaggttat tatttttggt 1440
gaacactctg ctgtgtacaa caagcctgcc gtcgctgcta gtgtgtctgc gttgagaacc 1500
tacctgctaa taagcgagtc atctgcacca gatactattg aattggactt cccggacatt 1560
agctttaatc ataagtggtc catcaatgat ttcaatgcca tcaccgagga tcaagtaaac 1620
tcccaaaaat tggccaaggc tcaacaagcc accgatggct tgtctcagga actcgttagt 1680
cttttggatc cgttgttagc tcaactatcc gaatccttcc actaccatgc agcgttttgt 1740
ttcctgtata tgtttgtttg cctatgcccc catgccaaga atattaagtt ttctttaaag 1800
tctactttac ccatcggtgc tgggttgggc tcaagcgcct ctatttctgt atcactggcc 1860
ttagctatgg cctacttggg ggggttaata ggatctaatg acttggaaaa gctgtcagaa 1920
aacgataagc atatagtgaa tcaatgggcc ttcataggtg aaaagtgtat tcacggtacc 1980
ccttcaggaa tagataacgc tgtggccact tatggtaatg ccctgctatt tgaaaaagac 2040
tcacataatg gaacaataaa cacaaacaat tttaagttct tagatgattt cccagccatt 2100
ccaatgatcc taacctatac tagaattcca aggtctacaa aagatcttgt tgctcgcgtt 2160
cgtgtgttgg tcaccgagaa atttcctgaa gttatgaagc caattctaga tgccatgggt 2220
gaatgtgccc tacaaggctt agagatcatg actaagttaa gtaaatgtaa aggcaccgat 2280
gacgaggctg tagaaactaa taatgaactg tatgaacaac tattggaatt gataagaata 2340
aatcatggac tgcttgtctc aatcggtgtt tctcatcctg gattagaact tattaaaaat 2400
ctgagcgatg atttgagaat tggctccaca aaacttaccg gtgctggtgg cggcggttgc 2460
tctttgactt tgttacgaag agacattact caagagcaaa ttgacagctt caaaaagaaa 2520
ttgcaagatg attttagtta cgagacattt gaaacagact tgggtgggac tggctgctgt 2580
ttgttaagcg caaaaaattt gaataaagat cttaaaatca aatccctagt attccaatta 2640
tttgaaaata aaactaccac aaagcaacaa attgacgatc tattattgcc aggaaacacg 2700
aatttaccat ggacttcaga cgaggagttt taatgactgt atatactgct agtgtaactg 2760
ctccggtaaa tattgctact cttaagtatt gggggaaaag ggacacgaag ttgaatctgc 2820
ccaccaattc gtccatatca gtgactttat cgcaagatga cctcagaacg ttgacctctg 2880
cggctactgc acctgagttt gaacgcgaca ctttgtggtt aaatggagaa ccacacagca 2940
tcgacaatga aagaactcaa aattgtctgc gcgacctacg ccaattaaga aaggaaatgg 3000
aatcgaagga cgcctcattg cccacattat ctcaatggaa actccacatt gtctccgaaa 3060
ataactttcc tacagcagct ggtttagctt cctccgctgc tggctttgct gcattggtct 3120
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ctgcaattgc taagttatac caattaccac agtcaacttc 
gaaaggggtc tggttcagct tgtagatcgt tgtttggcgg 
gaaaagctga agatggtcat gattccatgg cagtacaaat 
ctcagatgaa agcttgtgtc ctagttgtca gcgatattaa 
agggtatgca attgaccgtg gcaacctccg aactatttaa 
taccaaagag atttgaagtc atgcgtaaag ccattgttga 
caaaggaaac aatgatggat tccaactctt tccatgccac 
caatattcta catgaatgac acttccaagc gtatcatcag 
agttttacgg agaaacaatc gttgcataca cgtttgatgc 
actacttagc tgaaaatgag tcgaaactct ttgcatttat 
ttcctggatg ggacaagaaa tttactactg agcagcttga 
aatcatctaa ctttactgca cgtgaattgg atcttgagtt 
tgattttaac' tcaagtcggt tcaggcccac aagaaacaaa 
agactggtct accaaaggaa gaggagtttt aactcgacgc 
cagaacgttt acattgtatc gactgccaga accccaattg 
tcctccaaga cagcagtgga attgggtgct gttgctttaa 
ccagaattgg atgcatccaa ggattttgac gaaattattt 
aatttgggcc aagctccggc cagacaagtt gctttggctg 
gttgcaagca cagttaacaa ggtctgtgca tccgctatga 
caatccatca aatgtggtaa tgctgatgtt gtcgtagctg 
aacgcaccat actacatgcc agcagcccgt gcgggtgcca 
gttgatggtg tcgaaagaga tgggttgaac gatgcgtacg 
cacgcagaaa agtgtgcccg tgattgggat attactagag 
atcgaatcct accaaaaatc tcaaaaatct caaaaggaag 
gtacctgtta ccattaaggg atttagaggt aagcctgata 
gaacctgcta gattacacgt tgaaaaattg agatctgcaa 
aacggtactg ttactgccgc taacgcttct ccaatcaacg 
ttggtttccg aaaaagtttt gaaggaaaag aatttgaagc 
tggggtgagg ccgctcatca accagctgat tttacatggg 
aaggctttga aacatgctgg catcgaagac atcaattctg 
gaagcctttt cggttgtcgg tttggtgaac actaagattt 
gttaatgtat atggtggtgc tgttgctcta ggtcacccat 



agaaatatct agaatagcaa 


3180 


atacgtggcc tgggaaatgg 


3240 


cgcagacagc tctgactggc 


3300 


aaaggatgtg agttccactc 


3360 


agaaagaatt gaacatgtcg 


3420 


aaaagatttc gccacctttg 


3480 


atgtttggac tctttccctc 


3540 


ttggtgccac accattaatc 


3600 


aggtccaaat gctgtgttgt 


3660 


ctataaattg tttggctctg 


3720 


ggctttcaac catcaatttg 


3780 


gcaaaaggat gttgccagag 


3840 


cgaatctttg attgacgcaa 


3900 


cggcggaggc acatatgtct 


3960 


gttcattcca gggttctcta 


4020 


aaggcgcctt ggctaaggtt 


4 080 


ttggtaacgt tctttctgcc 


4140 


ccggtttgag taatcatatc 


4200 


aggcaatcat tttgggtgct 


4260 


gtggttgtga atctatgact 


4320 


aatttggcca aactgttctt . 


4380 


atggtctagc catgggtgta 


4440 


aacaacaaga caattttgcc 


4500 


gtaaattcga caatgaaatt 


4560 


ctcaagtcac gaaggacgag 


4620 


ggactgtttt ccaaaaagaa 


4680 


atggtgctgc agccgtcatc 


4740 


ctttggctat tatcaaaggt 


4 800 


ctccatctct tgcagttcca 


4860 


ttgattactt tgaattcaat 


4920 


tgaagctaga cccatctaag 


4 980 


tgggttgttc tggtgctaga 


504 0 
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gtggttgtta cactgctatc catcttacag caagaaggag gtaagatcgg tgttgccgcc 5100 

atttgtaatg gtggtggtgg tgcttcctct attgtcattg aaaagatatg aggatcctct 5160 

agatgcgcag gaggcacata tggcgaagaa cgttgggatt ttggctatgg atatctattt 5220 

ccctcccacc tgtgttcaac aggaagcttt ggaagcacat gatggagcaa gtaaagggaa 5280 

atacactatt ggacttggcc aagattgttt agctttttgc actgagcttg aagatgttat 5340 

ctctatgagt ttcaatgcgg tgacatcact ttttgagaag tataagattg accctaacca 5400 

aatcgggcgt cttgaagtag gaagtgagac tgttattgac aaaagcaagt ccatcaagac 5460

cttcttgatg cagctctttg agaaatgtgg aaacactgat gtcgaaggtg ttgactcgac 5520 

caatgcttgc tatggtggaa ctgcagcttt gttaaactgt gtcaattggg ttgagagtaa 5580 

ctcttgggat ggacgttatg gcctcgtcat ttgtactgac agcgcggttt atgcagaagg 5640 

acccgcaagg cccactggag gagctgcagc gattgctatg ttgataggac ctgatgctcc 5700 

tatcgttttc gaaagcaaat tgagagcaag ccacatggct catgtctatg acttttacaa 5760 

gcccaatctt gctagcgagt acccggttgt tgatggtaag ctttcacaga cttgctacct 5820 

catggctctt gactcctgct ataaacattt atgcaacaag ttcgagaaga tcgagggcaa 5880 

agagttctcc ataaatgatg ctgattacat tgttttccat tctccataca ataaacttgt 5940

acagaaaagc tttgctcgtc tcttgtacaa cgacttcttg agaaacgcaa gctccattga 6000 

cgaggctgcc aaagaaaagt tcacccctta ttcatctttg acccttgacg agagttacca 6060 

aagccgtgat cttgaaaagg tgtcacaaca aatttcgaaa ccgttttatg atgctaaagt 6120 

gcaaccaacg actttaatac caaaggaagt cggtaacatg tacactgctt ctctctacgc 6180 

tgcatttgct tccctcatcc acaataaaca caatgatttg gcgggaaagc gggtggttat 6240 

gttctcttat ggaagtggct ccaccgcaac aatgttctca ttacgcctca acgacaataa 6300 

gcctcctttc agcatttcaa acattgcatc tgtaatggat gttggcggta aattgaaagc 6360 

tagacatgag tatgcacctg agaagtttgt ggagacaatg aagctaatgg aacataggta 6420 

tggagcaaag gactttgtga caaccaagga gggtattata gatcttttgg caccgggaac 6480 

ttattatctg aaagaggttg attccttgta ccggagattc tatggcaaga aaggtgaaga 6540 

tggatctgta gccaatggac actgaggatc cgtcgagcac gtggaggcac atatgcaatg 6600 

ctgtgagatg cctgttggat acattcagat tcctgttggg attgctggtc cattgttgct 6660 

tgatggttat gagtactctg ttcctatggc tacaaccgaa ggttgtttgg ttgctagcac 6720 

taacagaggc tgcaaggcta tgtttatctc tggtggcgcc accagtaccg ttcttaagga 6780 

cggtatgacc cgagcacctg ttgttcggtt cgcttcggcg agacgagctt cggagcttaa 6840 

gtttttcttg gagaatccag agaactttga tactttggca gtagtcttca acaggtcgag 6900 

tagatttgca agactgcaaa gtgttaaatg cacaatcgcg gggaagaatg cttatgtaag 6960

gttctgttgt agtactggtg atgctatggg gatgaatatg gtttctaaag gtgtgcagaa 7020

tgttcttgag tatcttaccg atgatttccc tgacatggat gtgattggaa tctctggtaa 7080

cttctgttcg gacaagaaac ctgctgctgt gaactggatt gagggacgtg gtaaatcagt 7140

tgtttgcgag gctgtaatca gaggagagat cgtgaacaag gtcttgaaaa cgagcgtggc 7200

tgctttagtc gagctcaaca tgctcaagaa cctagctggc tctgctgttg caggctctct 7260

aggtggattc aacgctcatg ccagtaacat agtgtctgct gtattcatag ctactggcca 7320

agatccagct caaaacgtgg agagttctca atgcatcacc atgatggaag ctattaatga 7380

cggcaaagat atccatatct cagtcactat gccatctatc gaggtgggga cagtgggagg 7440

aggaacacag cttgcatctc aatcagcgtg tttaaacctg ctcggagtta aaggagcaag 7500

cacagagtcg ccgggaatga acgcaaggag gctagcgacg atcgtagccg gagcagtttt 7560

agctggagag ttatctttaa tgtcagcaat tgcagctgga cagcttgtga gaagtcacat 7620

gaaatacaat agatccagcc gagacatctc tggagcaacg acaacgacaa caacaacaac 7680


a'hrrapprTT+'a 

d LLJdL.L-L.y t_ d 


yyciyy\_-civ_>ci.L. 


at" na rrt t ppp aapaacraaaa 


aaaggattat gatgaagaac 


7740 


ciciLLclciyyLL 


yet L.yydciycici 


fTi""ht"n"t"at*PCT ttatacratcra 

y L L Ly Uu UL>y n-y uayaiya 


aaatgatgtc cctttaagat 


7800 


a t <"T/T:a a r*rr:a a 


a a ^ rrrra rrt" rrt 
ci ci d y y ca, y l. y u 


pa"h1"f" crat era aaaatataaa 


taaaggtctt ttgeatagag 


78 60 


/-i 4~ 4- /~«"f" 4- -3 4- 
CaLLCLCLal. 


y L L L.d LtLLC 




act.iicaoca.a cotccaqaaq 


7920 


ayaaaaLLau 


at u. l.i*i*ci lv»»\— 


1"hai*nrfaraa a t" a pat* at t* a 


ctcccaccca ttggatgttg 


7980 


<— Lyy uyaauy 


f- rrrrt a 3 1~ 3 pt 
l y y luu L.u v — i_ 


ttacctgaag ctgttgaagg 


tgttaagaat gcagctcaac 


8040 




UUa L.y cia l. L. y 


aat at t paaa ccaaatatat 


tcccaaagac aaatttcagt 


8100 


ttcttacacg aatccattac cttgctccta gtactggtgc ttggggagag catgaaattg 8160

actacattct tttcttcaaa ggtaaagttg agctggatat caatcccaat gaagttcaag 8220

cctataagta tgttactatg gaagagttaa aagagatgtt ttccgatcct caatatggat 8280

tcacaccatg gttcaaactt atttgtgagc attttatgtt taaatggtgg caggatgtag 8340

atcatgcgtc aaaattccaa gataccttaa ttcatcgttg ctaaggatcc cccgggatcc 8400



<210> 65 

<211> 55 

<212> DNA

<213> Artificial Sequence

<220>

<223> PCR primer containing R. capsulatus DNA
<400> 65

gcgatatcgg atccaggagg accatatgat cgccgaagcg gatatggagg tctgc 55 

<210> 66 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer containing R. capsulatus DNA 
<400> 66 

gcgatatcaa gcttggatcc tcaatccatc gccaggccgc ggtcgcgcgc 50 

<210> 67 
<211> 60 
<212> DNA 

<213> Artificial Sequence 

<220>

<223> Oligonucleotide containing N. tabacum and R. capsulatus DNA

<400> 67

ctttcctgaa acataattta taatcagatc caggaggacc atatgatcgc cgaagcggat 60

<210> 68 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide containing N. tabacum and R. capsulatus DNA 
<400> 68 

cgaccgcggc ctggcgatgg attgaggatc taaacaaacc cggaacagac cgttgggaag 60 

<210> 69 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220>
<223> Oligonucleotide containing N. tabacum and R. capsulatus DNA

<400> 69

atttttcatc tcgaattgta ttcccacgaa ggccgcgtcg actacggccg caggaggagt 60

<210> 70
<211> 60
<212> DNA

<213> Artificial Sequence
<220>

<223> Oligonucleotide containing N. tabacum and R. capsulatus DNA
<400> 70

ttcggatcga tcctgcgcgg ctgagcggcc ggaatggtga agttgaaaaa cgaatccttc 60

<210> 71 

<211> 1020 

<212> DNA 

<213> Rhodobacter capsulatus 



<400> 71 

atgatcgccg aagcggatat ggaggtctgc cgggagctga tccgcaccgg cagctactcc 60

ttccatgcgg cgtccagagt tctgccggcg cgggtccgtg accccgcgct ggcgctttac 120

gccttttgcc gcgtcgccga tgacgaagtc gacgaggttg gcgcgccgcg cgacaaggct 180

gcggcggttt tgaaacttgg cgaccggctg gaggacatct atgccggtcg tccgcgcaat 240

gcgccctcgg atcgggcttt cgcggcggtg gtcgaggaat tcgagatgcc gcgcgaattg 300

cccgaggcgc tgctggaggg cttcgcctgg gatgccgagg ggcggtggta tcacacgctt 360

tcggacgtgc aggcctattc ggcgcgggtg gcggccgccg tcggcgcgat gatgtgcgtg 420

ctgatgcggg tgcgcaaccc cgatgcgctg gcgcgggcct gcgatctcgg tcttgccatg 480

cagatgtcga acatcgcccg cgacgtgggc gaggatgccc gggcggggcg gcttttcctg 540

ccgaccgact ggatggtcga ggaggggatc gatccgcagg cgttcctggc cgatccgcag 600

cccaccaagg gcatccgccg ggtcaccgag cggttgctga accgcgccga ccggctttac 660

tggcgggcgg cgacgggggt gcggcttttg ccctttgact gccgaccggg gatcatggcc 720

gcgggcaaga tctatgccgc gatcggggcc gaggtggcga aggcgaaata cgacaacatc 780

acccggcgtg cccacacgac caagggccgc aagctgtggc tggtggcgaa ttccgcgatg 840

tcggcgacgg cgacctcgat gctgccgctc tcgccgcggg tgcatgccaa gcccgagccc 900 

gaagtggcgc atctggtcga tgccgccgcg catcgcaacc tgcatcccga acggtccgag 960 

gtgctgatct cggcgctgat ggcgctgaag gcgcgcgacc gcggcctggc gatggattga 1020 

<210> 72 
<211> 13917 
<212> DNA 

<213> Artificial Sequence 

<220>
<221> misc_feature
<222> ()..()
<223> Plastid transformation vector pHK04, containing Operon B, contain
i

<400> 72

gcacttttcg gggaaatgtg cgcggaaccc ctatttgttt atttttctaa atacattcaa 60

atatgtatcc gctcatgaga caataaccct gataaatgct tcaataatat tgaaaaagga 120

agagtatgag tattcaacat ttccgtgtcg cccttattcc cttttttgcg gcattttgcc 180

ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg 240

gtgcacgagt gggttacatc gaactggatc tcaacagcgg taagatcctt gagagttttc 300

gccccgaaga acgttttcca atgatgagca cttttaaagt tctgctatgt ggcgcggtat 360

tatcccgtat tgacgccggg caagagcaac tcggtcgccg catacactat tctcagaatg 420

acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag 480

aattatgcag tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa 540

cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc 600

gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag cgtgacacca 660

cgatgcctgt agcaatggca acaacgttgc gcaaactatt aactggcgaa ctacttactc 720

tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc 780

tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg 840

ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta 900

tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag 960

gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga 1020 

ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc 1080 

tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac cccgtagaaa 1140

agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa 1200 

aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc 1260 

cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt 1320 

agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc 1380 

tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac 1440

gatagttacc ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc acacagccca 1500

gcttggagcg aacgacctac accgaactga gatacctaca gcgtgagcta tgagaaagcg 1560

ccacgcttcc cgaagggaga aaggcggaca ggtatccggt aagcggcagg gtcggaacag 1620 

gagagcgcac gagggagctt ccagggggaa acgcctggta tctttatagt cctgtcgggt 1680 

ttcgccacct ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg cggagcctat 1740

ggaaaaacgc cagcaacgcg gcctttttac ggttcctggc cttttgctgg ccttttgctc 1800

acatgttctt tcctgcgtta tcccctgatt ctgtggataa ccgtattacc gcctttgagt 1860 

gagctgatac cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg agcgaggaag 1920 

cggaagagcg cccaatacgc aaaccgcctc tccccgcgcg ttggccgatt cattaatgca 1980 

gctggcacga caggtttccc gactggaaag cgggcagtga gcgcaacgca attaatgtga 2040

gttagctcac tcattaggca ccccaggctt tacactttat gcttccggct cgtatgttgt 2100 

gtggaattgt gagcggataa caatttcaca caggaaacag ctatgaccat gattacgcca 2160 

agctcgaaat taaccctcac taaagggaac aaaagctgga gctccaccgc ggtggcggcc 2220 

gctctagaac tagtggatct tcttggctgt tattcaaaag gtccaacaat gtatatatat 2280 

tggacatttt gaggcaatta tagatcctgg aaggcaattc tgattggtca ataaaaatcg 2340

atttcaatgc tatttttttt ttgtttttta tgagtttagc caatttatca tgaaaggtaa 2400 

aaggggataa aggaaccgtg tgttgattgt cctgtaaata taagttgtct tcctccatat 2460 

gtaaaaaggg aataaataaa tcaattaaat ttcgggatgc ttcatgaagt gcttctttcg 2520 

gagttaaact tccgtttgtc catatttcga gaaaaagtat ctcttgtttt tcattcccat 2580 

tcccataaga atgaatacta tgattcgcgt ttcgaacagg catgaataca gcatctatag 2640 

gataacttcc atcttgaaag ttatgtggcg tttttataag atatccacga tttctctcta 2700 

tttgtaatcc aatacaaaaa tcaattggtt ccgttaaact ggctatatgt tgtgtattat 2760

caacgatttc tacataaggc ggcaagatga tatcttgggc agttacagat ccaggaccct 2820

tgacacaaat agatgcgtca gaagttccat atagattact tcttaatata atttctttca 2880 

aattcattaa aatttcatgt accgattctt gaatgcccgt tatggtagaa tattcatgtg 2940 

ggactttctc agattttaca cgtgtgatac atgttccttc tatttctcca agtaaagctc 3000 

ttcgcatcgc aatgcctatt gtgtcggctt ggcctttcat aagtggagac agaataaagc 3060

gtccataata aaggcgttta ctgtctgttc ttgattcaac acacttccac tgtagtgtcc 3120 

gagtagatac tgttactttc tctcgaacca tagtactatt atttgattag atcatcgaat 3180 

cttttatttc tcttgagatt tcttcaatgt tcagttctac acacgtcttt ttttcggagg 3240 

tctacagcca ttatgtggca taggagttac atcccgtacg aaagttaata gtataccact 3300 

tcgacgaata gctcgtaatg ctgcatctct tccgagaccg ggacctttta tcatgacttc 3360 

tgctcgttgc ataccttgat ccactactgt acggatagcg tttgctgctg cggtttgagc 3420 

agcaaacggt gttcctcttc tcgtaccttt gaatccagaa gtaccggcgg aggaccaaga 3480 

aactactcga ccccgtacat ctgtaacagt gacaatggta ttattgaaac ttgcttgaac 3540 

atgaataact ccctttggta ttctacgtgc acccttacgt gaaccaatac gtccattcct 3600 

acgcgaacta attttcggta tagcttttgc catattttat catctcgtaa atatgagtca 3660 

gagatatatg gatatatcca tttcatgtca aaacagattc tttatttgta catcggctct 3720 

tctggcaagt ctgattatcc ctgtctttgt ttatgtctcg ggttggaaca aattactata 3780

attcgtcccc gcctacggat tagtcgacat ttttcacaaa ttttacgaac ggaagctctt 3840

attttcatat ttctcattcc ttaccttaat tctgaatcta tttcttggaa gaaaataagt 3900

ttcttgaaat ttttcatctc gaattgtatt cccacgaaag gaatggtgaa gttgaaaaac 3960 

gaatccttca aatctttgtt gtggagtcga taaattatac gccctttggt tgaatcataa 4020 

ggacttactt caattttgac tctatctcct ggcagtatcc gtataaaact atgccggatc 4080 

tttcctgaaa cataatttat aatcagatcg gccgcaggag gagttcatat gtcagagttg 4140

agagccttca gtgccccagg gaaagcgtta ctagctggtg gatatttagt tttagataca 4200 

aaatatgaag catttgtagt cggattatcg gcaagaatgc atgctgtagc ccatccttac 4260 

ggttcattgc aagggtctga taagtttgaa gtgcgtgtga aaagtaaaca atttaaagat 4320 

ggggagtggc tgtaccatat aagtcctaaa agtggcttca ttcctgtttc gataggcgga 4380 

tctaagaacc ctttcattga aaaagttatc gctaacgtat ttagctactt taaacctaac 4440 

atggacgact actgcaatag aaacttgttc gttattgata ttttctctga tgatgcctac 4500 

cattctcagg aggatagcgt taccgaacat cgtggcaaca gaagattgag ttttcattcg 4560 

cacagaattg aagaagttcc caaaacaggg ctgggctcct cggcaggttt agtcacagtt 4620

ttaactacag ctttggcctc cttttttgta tcggacctgg aaaataatgt agacaaatat 4680

agagaagtta ttcataattt agcacaagtt gctcattgtc aagctcaggg taaaattgga 4740

agcgggtttg atgtagcggc ggcagcatat ggatctatca gatatagaag attcccaccc 4800 

gcattaatct ctaatttgcc agatattgga agtgctactt acggcagtaa actggcgcat 4860 

ttggttgatg aagaagactg gaatattacg attaaaagta accatttacc ttcgggatta 4920

actttatgga tgggcgatat taagaatggt tcagaaacag taaaactggt ccagaaggta 4980

aaaaattggt atgattcgca tatgccagaa agcttgaaaa tatatacaga actcgatcat 5040

gcaaattcta gatttatgga tggactatct aaactagatc gcttacacga gactcatgac 5100 

gattacagcg atcagatatt tgagtctctt gagaggaatg actgtacctg tcaaaagtat 5160 

cctgaaatca cagaagttag agatgcagtt gccacaatta gacgttcctt tagaaaaata 5220 

actaaagaat ctggtgccga tatcgaacct cccgtacaaa ctagcttatt ggatgattgc 5280 

cagaccttaa aaggagttct tacttgctta atacctggtg ctggtggtta tgacgccatt 5340 

gcagtgatta ctaagcaaga tgttgatctt agggctcaaa ccgctaatga caaaagattt 5400 

tctaaggttc aatggctgga tgtaactcag gctgactggg gtgttaggaa agaaaaagat 5460

ccggaaactt atcttgataa actgcaggag gagttttaat gtcattaccg ttcttaactt 5520

ctgcaccggg aaaggttatt atttttggtg aacactctgc tgtgtacaac aagcctgccg 5580

tcgctgctag tgtgtctgcg ttgagaacct acctgctaat aagcgagtca tctgcaccag 5640 

atactattga attggacttc ccggacatta gctttaatca taagtggtcc atcaatgatt 5700 

tcaatgccat caccgaggat caagtaaact cccaaaaatt ggccaaggct caacaagcca 5760

ccgatggctt gtctcaggaa ctcgttagtc ttttggatcc gttgttagct caactatccg 5820 

aatccttcca ctaccatgca gcgttttgtt tcctgtatat gtttgtttgc ctatgccccc 5880 

atgccaagaa tattaagttt tctttaaagt ctactttacc catcggtgct gggttgggct 5940 

caagcgcctc tatttctgta tcactggcct tagctatggc ctacttgggg gggttaatag 6000 

gatctaatga cttggaaaag ctgtcagaaa acgataagca tatagtgaat caatgggcct 6060 

tcataggtga aaagtgtatt cacggtaccc cttcaggaat agataacgct gtggccactt 6120 

atggtaatgc cctgctattt gaaaaagact cacataatgg aacaataaac acaaacaatt 6180

ttaagttctt agatgatttc ccagccattc caatgatcct aacctatact agaattccaa 6240

ggtctacaaa agatcttgtt gctcgcgttc gtgtgttggt caccgagaaa tttcctgaag 6300 

ttatgaagcc aattctagat gccatgggtg aatgtgccct acaaggctta gagatcatga 6360 

ctaagttaag taaatgtaaa ggcaccgatg acgaggctgt agaaactaat aatgaactgt 6420 

atgaacaact attggaattg ataagaataa atcatggact gcttgtctca atcggtgttt 6480

ctcatcctgg attagaactt attaaaaatc tgagcgatga tttgagaatt ggctccacaa 6540 

aacttaccgg tgctggtggc ggcggttgct ctttgacttt gttacgaaga gacattactc 6600 






aagagcaaat tgacagcttc aaaaagaaat tgcaagatga ttttagttac gagacatttg 6660

aaacagactt gggtgggact ggctgctgtt tgttaagcgc aaaaaatttg aataaagatc 6720

ttaaaatcaa atccctagta ttccaattat ttgaaaataa aactaccaca aagcaacaaa 6780

ttgacgatct attattgcca ggaaacacga atttaccatg gacttcagac gaggagtttt 6840

aatgactgta tatactgcta gtgtaactgc tccggtaaat attgctactc ttaagtattg 6900

ggggaaaagg gacacgaagt tgaatctgcc caccaattcg tccatatcag tgactttatc 6960

gcaagatgac ctcagaacgt tgacctctgc ggctactgca cctgagtttg aacgcgacac 7020

tttgtggtta aatggagaac cacacagcat cgacaatgaa agaactcaaa attgtctgcg 7080

cgacctacgc caattaagaa aggaaatgga atcgaaggac gcctcattgc ccacattatc 7140

tcaatggaaa ctccacattg tctccgaaaa taactttcct acagcagctg gtttagcttc 7200

ctccgctgct ggctttgctg cattggtctc tgcaattgct aagttatacc aattaccaca 7260

gtcaacttca gaaatatcta gaatagcaag aaaggggtct ggttcagctt gtagatcgtt 7320

gtttggcgga tacgtggcct gggaaatggg aaaagctgaa gatggtcatg attccatggc 7380

agtacaaatc gcagacagct ctgactggcc tcagatgaaa gcttgtgtcc tagttgtcag 7440

cgatattaaa aaggatgtga gttccactca gggtatgcaa ttgaccgtgg caacctccga 7500

actatttaaa gaaagaattg aacatgtcgt accaaagaga tttgaagtca tgcgtaaagc 7560

cattgttgaa aaagatttcg ccacctttgc aaaggaaaca atgatggatt ccaactcttt 7620

ccatgccaca tgtttggact ctttccctcc aatattctac atgaatgaca cttccaagcg 7680

tatcatcagt tggtgccaca ccattaatca gttttacgga gaaacaatcg ttgcatacac 7740

gtttgatgca ggtccaaatg ctgtgttgta ctacttagct gaaaatgagt cgaaactctt 7800

tgcatttatc tataaattgt ttggctctgt tcctggatgg gacaagaaat ttactactga 7860

gcagcttgag gctttcaacc atcaatttga atcatctaac tttactgcac gtgaattgga 7920

tcttgagttg caaaaggatg ttgccagagt gattttaact caagtcggtt caggcccaca 7980

agaaacaaac gaatctttga ttgacgcaaa gactggtcta ccaaaggaag aggagtttta 8040

actcgacgcc ggcggaggca catatgtctc agaacgttta cattgtatcg actgccagaa 8100

ccccaattgg ttcattccag ggttctctat cctccaagac agcagtggaa ttgggtgctg 8160

ttgctttaaa aggcgccttg gctaaggttc cagaattgga tgcatccaag gattttgacg 8220

aaattatttt tggtaacgtt ctttctgcca atttgggcca agctccggcc agacaagttg 8280

ctttggctgc cggtttgagt aatcatatcg ttgcaagcac agttaacaag gtctgtgcat 8340

ccgctatgaa ggcaatcatt ttgggtgctc aatccatcaa atgtggtaat gctgatgttg 8400

tcgtagctgg tggttgtgaa tctatgacta acgcaccata ctacatgcca gcagcccgtg 8460

cgggtgccaa atttggccaa actgttcttg ttgatggtgt cgaaagagat gggttgaacg 8520

atgcgtacga tggtctagcc atgggtgtac acgcagaaaa gtgtgcccgt gattgggata 8580

ttactagaga acaacaagac aattttgcca tcgaatccta ccaaaaatct caaaaatctc 8640

aaaaggaagg taaattcgac aatgaaattg tacctgttac cattaaggga tttagaggta 8700

agcctgatac tcaagtcacg aaggacgagg aacctgctag attacacgtt gaaaaattga 8760

gatctgcaag gactgttttc caaaaagaaa acggtactgt tactgccgct aacgcttctc 8820 

caatcaacga tggtgctgca gccgtcatct tggtttccga aaaagttttg aaggaaaaga 8880 

atttgaagcc tttggctatt atcaaaggtt ggggtgaggc cgctcatcaa ccagctgatt 8940

ttacatgggc tccatctctt gcagttccaa aggctttgaa acatgctggc atcgaagaca 9000 

tcaattctgt tgattacttt gaattcaatg aagccttttc ggttgtcggt ttggtgaaca 9060 

ctaagatttt gaagctagac ccatctaagg ttaatgtata tggtggtgct gttgctctag 9120 

gtcacccatt gggttgttct ggtgctagag tggttgttac actgctatcc atcttacagc 9180 

aagaaggagg taagatcggt gttgccgcca tttgtaatgg tggtggtggt gcttcctcta 9240

ttgtcattga aaagatatga ggatcctcta gatgcgcagg aggcacatat ggcgaagaac 9300

gttgggattt tggctatgga tatctatttc cctcccacct gtgttcaaca ggaagctttg 9360 

gaagcacatg atggagcaag taaagggaaa tacactattg gacttggcca agattgttta 9420 

gctttttgca ctgagcttga agatgttatc tctatgagtt tcaatgcggt gacatcactt 9480 

tttgagaagt ataagattga ccctaaccaa atcgggcgtc ttgaagtagg aagtgagact 9540

gttattgaca aaagcaagtc catcaagacc ttcttgatgc agctctttga gaaatgtgga 9600

aacactgatg tcgaaggtgt tgactcgacc aatgcttgct atggtggaac tgcagctttg 9660

ttaaactgtg tcaattgggt tgagagtaac tcttgggatg gacgttatgg cctcgtcatt 9720 

tgtactgaca gcgcggttta tgcagaagga cccgcaaggc ccactggagg agctgcagcg 9780 

attgctatgt tgataggacc tgatgctcct atcgttttcg aaagcaaatt gagagcaagc 9840

cacatggctc atgtctatga cttttacaag cccaatcttg ctagcgagta cccggttgtt 9900

gatggtaagc tttcacagac ttgctacctc atggctcttg actcctgcta taaacattta 9960

tgcaacaagt tcgagaagat cgagggcaaa gagttctcca taaatgatgc tgattacatt 10020

gttttccatt ctccatacaa taaacttgta cagaaaagct ttgctcgtct cttgtacaac 10080 

gacttcttga gaaacgcaag ctccattgac gaggctgcca aagaaaagtt caccccttat 10140 

tcatctttga cccttgacga gagttaccaa agccgtgatc ttgaaaaggt gtcacaacaa 10200 

atttcgaaac cgttttatga tgctaaagtg caaccaacga ctttaatacc aaaggaagtc 10260 

ggtaacatgt acactgcttc tctctacgct gcatttgctt ccctcatcca caataaacac 10320

aatgatttgg cgggaaagcg ggtggttatg ttctcttatg gaagtggctc caccgcaaca 10380

atgttctcat tacgcctcaa cgacaataag cctcctttca gcatttcaaa cattgcatct 10440

gtaatggatg ttggcggtaa attgaaagct agacatgagt atgcacctga gaagtttgtg 10500

gagacaatga agctaatgga acataggtat ggagcaaagg actttgtgac aaccaaggag 10560 

ggtattatag atcttttggc accgggaact tattatctga aagaggttga ttccttgtac 10620 

cggagattct atggcaagaa aggtgaagat ggatctgtag ccaatggaca ctgaggatcc 10680 

gtcgagcacg tggaggcaca tatgcaatgc tgtgagatgc ctgttggata cattcagatt 10740

cctgttggga ttgctggtcc attgttgctt gatggttatg agtactctgt tcctatggct 10800 

acaaccgaag gttgtttggt tgctagcact aacagaggct gcaaggctat gtttatctct 10860 

ggtggcgcca ccagtaccgt tcttaaggac ggtatgaccc gagcacctgt tgttcggttc 10920 

gcttcggcga gacgagcttc ggagcttaag tttttcttgg agaatccaga gaactttgat 10980 

actttggcag tagtcttcaa caggtcgagt agatttgcaa gactgcaaag tgttaaatgc 11040

acaatcgcgg ggaagaatgc ttatgtaagg ttctgttgta gtactggtga tgctatgggg 11100 

atgaatatgg tttctaaagg tgtgcagaat gttcttgagt atcttaccga tgatttccct 11160 

gacatggatg tgattggaat ctctggtaac ttctgttcgg acaagaaacc tgctgctgtg 11220 

aactggattg agggacgtgg taaatcagtt gtttgcgagg ctgtaatcag aggagagatc 11280 

gtgaacaagg tcttgaaaac gagcgtggct gctttagtcg agctcaacat gctcaagaac 11340 

ctagctggct ctgctgttgc aggctctcta ggtggattca acgctcatgc cagtaacata 11400 

gtgtctgctg tattcatagc tactggccaa gatccagctc aaaacgtgga gagttctcaa 11460 

tgcatcacca tgatggaagc tattaatgac ggcaaagata tccatatctc agtcactatg 11520 

ccatctatcg aggtggggac agtgggagga ggaacacagc ttgcatctca atcagcgtgt 11580

ttaaacctgc tcggagttaa aggagcaagc acagagtcgc cgggaatgaa cgcaaggagg 11640

ctagcgacga tcgtagccgg agcagtttta gctggagagt tatctttaat gtcagcaatt 11700

gcagctggac agcttgtgag aagtcacatg aaatacaata gatccagccg agacatctct 11760 

ggagcaacga caacgacaac aacaacaaca tgacccggga tccggccgat ctaaacaaac 11820 

ccggaacaga ccgttgggaa gcgattcagt aattaaagct tcatgactcc tttttggttc 11880 

ttaaagtccc tttgaggtat caactaataa gaaagatatt agacaacccc ccttttttct 11940

ttttcacaaa taggaagttt cgaatccaat ttggatatta aaaggattac cagatataac 12000 

acaaaatctc tccacctatt ccttctagtc gagcctctcg gtctgtcatt atacctcgag 12060 

aagtagaaag aattacaatc cccattccac ctaaaattcg cggaattcgt tgataattag 12120 

aatagattcg tagaccaggt cgactgattc gttttaaatt taaaatattt ctatagggtc 12180 

ttttcctatt ccttctatgt cgcagggtta aaaccaaaaa atatttgttt ttttctcgat 12240

gttttctcac gttttcgata aaaccttctc gtaaaagtat ttgaacaata ttttcggtaa 12300 

tattagtaga tgctattcga accacccttt ttcgatccat atcagcattt cgtatagaag 12360

ttattatctc agcaatagtg tccctaccca tgatgaacta aaattattgg ggcctccaaa 12420

tttgatataa tcaacgtgtt ttttacttat tttttttttg aatatgatat gaattattaa 12480

agatatatgc gtgagacaca atctactaat taatctattt ctttcaaata ccccactaga 12540

aacagatcac aatttcattt tataatacct cgggagctaa tgaaactatt ttagtaaaat 12600 

ttaattctct caattcccgg gcgattgcac caaaaattcg agttcctttt gatttccttc 12660 

cttcttgatc aataacaact gcagcattgt catcatatcg tattatcatc ccgttgtcac 12720 

gtttgagttc tttacaggtc cgcacaatta cagctctgac tacttctgat ctttctaggg 12780 

gcatatttgg tacggcttct ttgatcacag caacaataac gtcaccaata tgagcatatc 12840 

gacgattgct agctcctatg attcgaatac acatcaattc tcgagccccg ctgttatccg 12900 

ctacatttaa atgggtctga ggttgaatca tttttttaat ccgttctttg aatgcaaagg 12960 

gcgaagaaaa aaaagaaata tttttgtcca aaaaaaaaga aacatgcggt ttcgtttcat 13020 

atctaagagc cctttccgca tttttttcta ttacattacg aaataatgaa ttgagttcgt 13080 

ataggcattt tagatgctgc tagtgaaata gcccttctgg ctatattttc tgttactcca 13140 

cccatttcat aaagtattcg acccggttta acaacagcta cccaatattc aggggatccc 13200 

ccgggctgca ggaattcgat atcaagctta tcgataccgt cgacctcgag ggggggcccg 13260 

gtacccaatt cgccctatag tgagtcgtat tacaattcac tggccgtcgt tttacaacgt 13320 

cgtgactggg aaaaccctgg cgttacccaa cttaatcgcc ttgcagcaca tccccctttc 13380 

gccagctggc gtaatagcga agaggcccgc accgatcgcc cttcccaaca gttgcgcagc 13440

ctgaatggcg aatgggacgc gccctgtagc ggcgcattaa gcgcggcggg tgtggtggtt 13500 

acgcgcagcg tgaccgctac acttgccagc gccctagcgc ccgctccttt cgctttcttc 13560 

ccttcctttc tcgccacgtt cgccggcttt ccccgtcaag ctctaaatcg ggggctccct 13620 

ttagggttcc gatttagtgc tttacggcac ctcgacccca aaaaacttga ttagggtgat 13680 

ggttcacgta gtgggccatc gccctgatag acggtttttc gccctttgac gttggagtcc 13740 

acgttcttta atagtggact cttgttccaa actggaacaa cactcaaccc tatctcggtc 13800 

tattcttttg atttataagg gattttgccg atttcggcct attggttaaa aaatgagctg 13860

atttaacaaa aatttaacgc gaattttaac aaaatattaa cgcttacaat ttaggtg 13917 

<210> 73 
<211> 7252 
<212> DNA 

<213> Artificial Sequence
<220>

<221> misc_feature
<222> ()..()

<223> Plastid transformation vector pHK07, containing Operon C, contain
i

<400> 73

gcacttttcg gggaaatgtg cgcggaaccc ctatttgttt atttttctaa atacattcaa 60

atatgtatcc gctcatgaga caataaccct gataaatgct tcaataatat tgaaaaagga 120

agagtatgag tattcaacat ttccgtgtcg cccttattcc cttttttgcg gcattttgcc 180

ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg 240

gtgcacgagt gggttacatc gaactggatc tcaacagcgg taagatcctt gagagttttc 300

gccccgaaga acgttttcca atgatgagca cttttaaagt tctgctatgt ggcgcggtat 360

tatcccgtat tgacgccggg caagagcaac tcggtcgccg catacactat tctcagaatg 420

acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag 480

aattatgcag tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa 540

cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc 600

gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag cgtgacacca 660

cgatgcctgt agcaatggca acaacgttgc gcaaactatt aactggcgaa ctacttactc 720

tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc 780

tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg 840

ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta 900

tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag 960

gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga 1020

ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc 1080

tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac cccgtagaaa 1140

agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa 1200

aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc 1260

cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt 1320

agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc 1380

tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac 1440

gatagttacc ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc acacagccca 1500

gcttggagcg aacgacctac accgaactga gatacctaca gcgtgagcta tgagaaagcg 1560

ccacgcttcc cgaagggaga aaggcggaca ggtatccggt aagcggcagg gtcggaacag 1620

gagagcgcac gagggagctt ccagggggaa acgcctggta tctttatagt cctgtcgggt 1680

ttcgccacct ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg cggagcctat 1740

ggaaaaacgc cagcaacgcg gcctttttac ggttcctggc cttttgctgg ccttttgctc 1800

acatgttctt tcctgcgtta tcccctgatt ctgtggataa ccgtattacc gcctttgagt 1860

gagctgatac cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg agcgaggaag 1920

cggaagagcg cccaatacgc aaaccgcctc tccccgcgcg ttggccgatt cattaatgca 1980

gctggcacga caggtttccc gactggaaag cgggcagtga gcgcaacgca attaatgtga 2040

gttagctcac tcattaggca ccccaggctt tacactttat gcttccggct cgtatgttgt 2100

gtggaattgt gagcggataa caatttcaca caggaaacag ctatgaccat gattacgcca 2160

agctcgaaat taaccctcac taaagggaac aaaagctgga gctccaccgc ggtggcggcc 2220

gctctagaac tagtggatct tcttggctgt tattcaaaag gtccaacaat gtatatatat 2280

tggacatttt gaggcaatta tagatcctgg aaggcaattc tgattggtca ataaaaatcg 2340

atttcaatgc tatttttttt ttgtttttta tgagtttagc caatttatca tgaaaggtaa 2400

aaggggataa aggaaccgtg tgttgattgt cctgtaaata taagttgtct tcctccatat 2460


gtaaaaaggg 


aataaataaa 


tcaattaaat 


ttcgggatgc 


ttcatgaagt 


gcttctttcg 


2520 


gagttaaact 


tccgtttgtc 


catatttcga 


gaaaaagtat 


ctcttgtttt 


tcattcccat 


2580 


tcccataaga 


atgaatacta 


tgattcgcgt 


ttcgaacagg 


catgaataca 


gcatctatag 


2640 


gataacttcc 


atcttgaaag 


ttatgtggcg 


tttttataag 


atatccacga 


tttctctcta . 


.2700 


tttgtaatcc 


aatacaaaaa 


tcaattggtt 


ccgttaaact 


ggctatatgt 


tgtgtattat 


2760 


caacgatttc 


tacataaggc 


ggcaagatga 


tatcttgggc 


agttacagat 


ccaggaccct 


2820 


tgacacaaat 


agatgcgtca 


gaagttccat 


atagattact 


tcttaatata 


atttctttca 


2880 


aattcattaa 


aatttcatgt 


accgattctt 


gaatgcccgt. 


tatggtagaa- 


tattcatgtg 


2940 


ggactttctc 


agattttaca 


cgtgtgatac 


atgttccttc 


tatttctcca 


agtaaagctc 


3000 


ttcgcatcgc 


aatgcctatt 


gtgtcggctt 


ggcctttcat 


aagtggagac 


agaataaagc 


3060 


gtccataata 


aaggcgttta 


ctgtctgttc 


ttgattcaac 


acacttccac 


tgtagtgtcc . 


3120 


gagtagatac 


tgttactttc 


tctcgaacca 


tagtactatt 


atttgattag 


atcatcgaat 


' 3180 


cttttatttc 


tcttgagatt. 


tcttcaatgt 


tcagttctac 


acacgtcttt 


ttttcggagg 


3240 


tctacagcca 


ttatgtggca 


taggagttac 


atcccgtacg 


aaagttaa'ta 


gtataccact 


3300 


tcgacgaata 


gctcgtaatg 


ctgcatctct 


tccgagaccg 


ggacctttta^ 


tcatgacttc 


3360 
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tgctcgttgc ataccttgat ccactactgt acggatagcg tttgctgctg cggtttgagc 3420 

agcaaacggt gttcctcttc tcgtaccttt gaatccagaa gtaccggcgg aggaccaaga 3480 

aactactcga ccccgtacat ctgtaacagt gacaatggta ttattgaaac ttgcttgaac 3540

atgaataact ccctttggta ttctacgtgc acccttacgt gaaccaatac gtccattcct 3600 

acgcgaacta attttcggta tagcttttgc catattttat catctcgtaa atatgagtca 3660 

gagatatatg gatatatcca tttcatgtca aaacagattc tttatttgta catcggctct 3720 

tctggcaagt ctgattatcc ctgtctttgt ttatgtctcg ggttggaaca aattactata 3780 

attcgtcccc gcctacggat tagtcgacat ttttcacaaa ttttacgaac ggaagctctt 3840

attttcatat ttctcattcc ttaccttaat tctgaatcta tttcttggaa gaaaataagt 3900 

ttcttgaaat ttttcatctc gaattgtatt cccacgaaag gaatggtgaa gttgaaaaac 3960 

gaatccttca aatctttgtt gtggagtcga taaattatac gccctttggt tgaatcataa 4020 

ggacttactt caattttgac tctatctcct ggcagtatcc gtataaaact atgccggatc 4080 

tttcctgaaa cataatttat aatcagatcc aggaggacca tatgatcgcc gaagcggata 4140 

tggaggtctg ccgggagctg atccgcaccg gcagctactc cttccatgcg gcgtccagag 4200 

ttctgccggc gcgggtccgt gaccccgcgc tggcgcttta cgccttttgc cgcgtcgccg 4260

atgacgaagt cgacgaggtt ggcgcgccgc gcgacaaggc tgcggcggtt ttgaaacttg 4320 

gcgaccggct ggaggacatc tatgccggtc gtccgcgcaa tgcgccctcg gatcgggctt 4380 

tcgcggcggt ggtcgaggaa ttcgagatgc cgcgcgaatt gcccgaggcg ctgctggagg 4440

gcttcgcctg ggatgccgag gggcggtggt atcacacgct ttcggacgtg caggcctatt 4500 

cggcgcgggt ggcggccgcc gtcggcgcga tgatgtgcgt gctgatgcgg gtgcgcaacc 4560 

ccgatgcgct ggcgcgggcc tgcgatctcg gtcttgccat gcagatgtcg aacatcgccc 4620 

gcgacgtggg cgaggatgcc cgggcggggc ggcttttcct gccgaccgac tggatggtcg 4680 

aggaggggat cgatccgcag gcgttcctgg ccgatccgca gcccaccaag ggcatccgcc 4740 

gggtcaccga gcggttgctg aaccgcgccg accggcttta ctggcgggcg gcgacggggg 4800 

tgcggctttt gccctttgac tgccgaccgg ggatcatggc cgcgggcaag atctatgccg 4860 

cgatcggggc cgaggtggcg aaggcgaaat acgacaacat cacccggcgt gcccacacga 4920

ccaagggccg caagctgtgg ctggtggcga attccgcgat gtcggcgacg gcgacctcga 4980 

tgctgccgct ctcgccgcgg gtgcatgcca agcccgagcc cgaagtggcg catctggtcg 5040

atgccgccgc gcatcgcaac ctgcatcccg aacggtccga ggtgctgatc tcggcgctga 5100 

tggcgctgaa ggcgcgcgac cgcggcctgg cgatggattg aggatctaaa caaacccgga 5160 

acagaccgtt gggaagcgat tcagtaatta aagcttcatg actccttttt ggttcttaaa 5220 

gtccctttga ggtatcaact aataagaaag atattagaca accccccttt tttctttttc 5280

acaaatagga agtttcgaat ccaatttgga tattaaaagg attaccagat ataacacaaa 5340

atctctccac ctattccttc tagtcgagcc tctcggtctg tcattatacc tcgagaagta 5400 

gaaagaatta caatccccat tccacctaaa attcgcggaa ttcgttgata attagaatag 5460

attcgtagac caggtcgact gattcgtttt aaatttaaaa tatttctata gggtcttttc 5520 

ctattccttc tatgtcgcag ggttaaaacc aaaaaatatt tgtttttttc tcgatgtttt 5580 

ctcacgtttt cgataaaacc ttctcgtaaa agtatttgaa caatattttc ggtaatatta 5640

gtagatgcta ttcgaaccac cctttttcga tccatatcag catttcgtat agaagttatt 5700 

atctcagcaa tagtgtccct acccatgatg aactaaaatt attggggcct ccaaatttga 5760 

tataatcaac gtgtttttta cttatttttt ttttgaatat gatatgaatt attaaagata 5820 

tatgcgtgag acacaatcta ctaattaatc tatttctttc aaatacccca ctagaaacag 5880 

atcacaattt cattttataa tacctcggga gctaatgaaa ctattttagt aaaatttaat 5940

tctctcaatt cccgggcgat tgcaccaaaa attcgagttc cttttgattt ccttccttct 6000 

tgatcaataa caactgcagc attgtcatca tatcgtatta tcatcccgtt gtcacgtttg 6060 

agttctttac aggtccgcac aattacagct ctgactactt ctgatctttc taggggcata 6120 

tttggtacgg cttctttgat cacagcaaca ataacgtcac caatatgagc atatcgacga 6180

ttgctagctc ctatgattcg aatacacatc aattctcgag ccccgctgtt atccgctaca 6240

tttaaatggg tctgaggttg aatcattttt ttaatccgtt ctttgaatgc aaagggcgaa 6300 

gaaaaaaaag aaatattttt gtccaaaaaa aaagaaacat gcggtttcgt ttcatatcta 6360 

agagcccttt ccgcattttt ttctattaca ttacgaaata atgaattgag ttcgtatagg 6420 

cattttagat gctgctagtg aaatagccct tctggctata ttttctgtta ctccacccat 6480 

ttcataaagt attcgacccg gtttaacaac agctacccaa tattcagggg atcccccggg 6540

ctgcaggaat tcgatatcaa gcttatcgat accgtcgacc tcgagggggg gcccggtacc 6600 

caattcgccc tatagtgagt cgtattacaa ttcactggcc gtcgttttac aacgtcgtga 6660 

ctgggaaaac cctggcgtta cccaacttaa tcgccttgca gcacatcccc ctttcgccag 6720 

ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc caacagttgc gcagcctgaa 6780 

tggcgaatgg gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg tggttacgcg 6840

cagcgtgacc gctacacttg ccagcgccct agcgcccgct cctttcgctt tcttcccttc 6900 

ctttctcgcc acgttcgccg gctttccccg tcaagctcta aatcgggggc tccctttagg 6960 

gttccgattt agtgctttac ggcacctcga ccccaaaaaa cttgattagg gtgatggttc 7020 

acgtagtggg ccatcgccct gatagacggt ttttcgccct ttgacgttgg agtccacgtt 7080 

ctttaatagt ggactcttgt tccaaactgg aacaacactc aaccctatct cggtctattc 7140 

ttttgattta taagggattt tgccgatttc ggcctattgg ttaaaaaatg agctgattta 7200

acaaaaattt aacgcgaatt ttaacaaaat attaacgctt acaatttagg tg 7252

<210> 74 

<211> 14623 

<212> DNA 

<213> Artificial Sequence 
<220> 

<221> misc_feature

<222> ()..() 

<223> Plastid transformation vector pHK08, containing Operon G, contain
i 



<400> 74 
cacctaaatt gtaagcgtta atattttgtt aaaattcgcg ttaaattttt gttaaatcag 60

ctcatttttt aaccaatagg ccgaaatcgg caaaatccct tataaatcaa aagaatagac 120

cgagataggg ttgagtgttg ttccagtttg gaacaagagt ccactattaa agaacgtgga 180

ctccaacgtc aaagggcgaa aaaccgtcta tcagggcgat ggcccactac gtgaaccatc 240

accctaatca agttttttgg ggtcgaggtg ccgtaaagca ctaaatcgga accctaaagg 300

gagcccccga tttagagctt gacggggaaa gccggcgaac gtggcgagaa aggaagggaa 360

gaaagcgaaa ggagcgggcg ctagggcgct ggcaagtgta gcggtcacgc tgcgcgtaac 420

caccacaccc gccgcgctta atgcgccgct acagggcgcg tcccattcgc cattcaggct 480

gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc agctggcgaa 540

agggggatgt gctgcaaggc gattaagttg ggtaacgcca gggttttccc agtcacgacg 600

ttgtaaaacg acggccagtg aattgtaata cgactcacta tagggcgaat tgggtaccgg 660

gccccccctc gaggtcgacg gtatcgataa gcttgatatc gaattcctgc agcccggggg 720

atcttcttgg ctgttattca aaaggtccaa caatgtatat atattggaca ttttgaggca 780

attatagatc ctggaaggca attctgattg gtcaataaaa atcgatttca atgctatttt 840

ttttttgttt tttatgagtt tagccaattt atcatgaaag gtaaaagggg ataaaggaac 900

cgtgtgttga ttgtcctgta aatataagtt gtcttcctcc atatgtaaaa agggaataaa 960

taaatcaatt aaatttcggg atgcttcatg aagtgcttct ttcggagtta aacttccgtt 1020

tgtccatatt tcgagaaaaa gtatctcttg tttttcattc ccattcccat aagaatgaat 1080




actatgattc gcgtttcgaa caggcatgaa tacagcatct ataggataac ttccatcttg 1140 

aaagttatgt ggcgttttta taagatatcc acgatttctc tctatttgta atccaataca 1200 

aaaatcaatt ggttccgtta aactggctat atgttgtgta ttatcaacga tttctacata 1260 

aggcggcaag atgatatctt gggcagttac agatccagga cccttgacac aaatagatgc 1320 

gtcagaagtt ccatatagat tacttcttaa tataatttct ttcaaattca ttaaaatttc 1380

atgtaccgat tcttgaatgc ccgttatggt agaatattca tgtgggactt tctcagattt 1440

tacacgtgtg atacatgttc cttctatttc tccaagtaaa gctcttcgca tcgcaatgcc 1500 

tattgtgtcg gcttggcctt tcataagtgg agacagaata aagcgtccat aataaaggcg 1560 

tttactgtct gttcttgatt caacacactt ccactgtagt gtccgagtag atactgttac 1620 

tttctctcga accatagtac tattatttga ttagatcatc gaatctttta tttctcttga 1680 

gatttcttca atgttcagtt ctacacacgt ctttttttcg gaggtctaca gccattatgt 1740

ggcataggag ttacatcccg tacgaaagtt aatagtatac cacttcgacg aatagctcgt 1800 

aatgctgcat ctcttccgag accgggacct tttatcatga cttctgctcg ttgcatacct 1860 

tgatccacta ctgtacggat agcgtttgct gctgcggttt gagcagcaaa cggtgttcct 1920 

cttctcgtac ctttgaatcc agaagtaccg gcggaggacc aagaaactac tcgaccccgt 1980 

acatctgtaa cagtgacaat ggtattattg aaacttgctt gaacatgaat aactcccttt 2040

ggtattctac gtgcaccctt acgtgaacca atacgtccat tcctacgcga actaattttc 2100 

ggtatagctt ttgccatatt ttatcatctc gtaaatatga gtcagagata tatggatata 2160 

tccatttcat gtcaaaacag attctttatt tgtacatcgg ctcttctggc aagtctgatt 2220 

atccctgtct ttgtttatgt ctcgggttgg aacaaattac tataattcgt ccccgcctac 2280 

ggattagtcg acatttttca caaattttac gaacggaagc tcttattttc atatttctca 2340

ttccttacct taattctgaa tctatttctt ggaagaaaat aagtttcttg aaatttttca 2400 

tctcgaattg tattcccacg aaaggaatgg tgaagttgaa aaacgaatcc ttcaaatctt 2460

tgttgtggag tcgataaatt atacgccctt tggttgaatc ataaggactt acttcaattt 2520 

tgactctatc tcctggcagt atccgtataa aactatgccg gatctttcct gaaacataat 2580 

ttataatcag atcggccgca ggaggagttc atatgtcaga gttgagagcc ttcagtgccc 2640

cagggaaagc gttactagct ggtggatatt tagttttaga tacaaaatat gaagcatttg 2700 

tagtcggatt atcggcaaga atgcatgctg tagcccatcc ttacggttca ttgcaagggt 2760 

ctgataagtt tgaagtgcgt gtgaaaagta aacaatttaa agatggggag tggctgtacc 2820 

atataagtcc taaaagtggc ttcattcctg tttcgatagg cggatctaag aaccctttca 2880 

ttgaaaaagt tatcgctaac gtatttagct actttaaacc taacatggac gactactgca 2940

atagaaactt gttcgttatt gatattttct ctgatgatgc ctaccattct caggaggata 3000 




gcgttaccga acatcgtggc aacagaagat tgagttttca ttcgcacaga attgaagaag 3060 

ttcccaaaac agggctgggc tcctcggcag gtttagtcac agttttaact acagctttgg 3120 

cctccttttt tgtatcggac ctggaaaata atgtagacaa atatagagaa gttattcata 3180 

atttagcaca agttgctcat tgtcaagctc agggtaaaat tggaagcggg tttgatgtag 3240 

cggcggcagc atatggatct atcagatata gaagattccc acccgcatta atctctaatt 3300 

tgccagatat tggaagtgct acttacggca gtaaactggc gcatttggtt gatgaagaag 3360 

actggaatat tacgattaaa agtaaccatt taccttcggg attaacttta tggatgggcg 3420 

atattaagaa tggttcagaa acagtaaaac tggtccagaa ggtaaaaaat tggtatgatt 3480

cgcatatgcc agaaagcttg aaaatatata cagaactcga tcatgcaaat tctagattta 3540

tggatggact atctaaacta gatcgcttac acgagactca tgacgattac agcgatcaga 3600 

tatttgagtc tcttgagagg aatgactgta cctgtcaaaa gtatcctgaa atcacagaag 3660 

ttagagatgc agttgccaca attagacgtt cctttagaaa aataactaaa gaatctggtg 3720 

ccgatatcga acctcccgta caaactagct tattggatga ttgccagacc ttaaaaggag 3780 

ttcttacttg cttaatacct ggtgctggtg gttatgacgc cattgcagtg attactaagc 3840 

aagatgttga tcttagggct caaaccgcta atgacaaaag attttctaag gttcaatggc 3900 

tggatgtaac tcaggctgac tggggtgtta ggaaagaaaa agatccggaa acttatcttg 3960

ataaactgca ggaggagttt taatgtcatt accgttctta acttctgcac cgggaaaggt 4020 

tattattttt ggtgaacact ctgctgtgta caacaagcct gccgtcgctg ctagtgtgtc 4080

tgcgttgaga acctacctgc taataagcga gtcatctgca ccagatacta ttgaattgga 4140 

cttcccggac attagcttta atcataagtg gtccatcaat gatttcaatg ccatcaccga 4200 

ggatcaagta aactcccaaa aattggccaa ggctcaacaa gccaccgatg gcttgtctca 4260 

ggaactcgtt agtcttttgg atccgttgtt agctcaacta tccgaatcct tccactacca 4320 

tgcagcgttt tgtttcctgt atatgtttgt ttgcctatgc ccccatgcca agaatattaa 4380 

gttttcttta aagtctactt tacccatcgg tgctgggttg ggctcaagcg cctctatttc 4440 

tgtatcactg gccttagcta tggcctactt gggggggtta ataggatcta atgacttgga 4500 

aaagctgtca gaaaacgata agcatatagt gaatcaatgg gccttcatag gtgaaaagtg 4560 

tattcacggt accccttcag gaatagataa cgctgtggcc acttatggta atgccctgct 4620

atttgaaaaa gactcacata atggaacaat aaacacaaac aattttaagt tcttagatga 4680

tttcccagcc attccaatga tcctaaccta tactagaatt ccaaggtcta caaaagatct 4740 

tgttgctcgc gttcgtgtgt tggtcaccga gaaatttcct gaagttatga agccaattct 4800 

agatgccatg ggtgaatgtg ccctacaagg cttagagatc atgactaagt taagtaaatg 4860 

taaaggcacc gatgacgagg ctgtagaaac taataatgaa ctgtatgaac aactattgga 4920

attgataaga ataaatcatg gactgcttgt [illegible] [illegible] [illegible] 4980

acttattaaa aatctgagcg atgatttgag [illegible] acaaaactta ccggtgctgg 5040

tggcggcggt tgctctttga ctttgttacg [illegible] actcaagagc aaattgacag 5100

cttcaaaaag aaattgcaag atgattttag [illegible] tttgaaacag acttgggtgg 5160

gactggctgc tgtttgttaa gcgcaaaaaa [illegible] gatcttaaaa tcaaatccct 5220

agtattccaa ttatttgaaa ataaaactac [illegible] caaattgacg atctattatt 5280

gccaggaaac acgaatttac catggacttc agacgaggag ttttaatgac tgtatatact 5340

gctagtgtaa ctgctccggt aaatattgct [illegible] attgggggaa aagggacacg 5400

aagttgaatc tgcccaccaa ttcgtccata [illegible] tatcgcaaga tgacctcaga 5460

acgttgacct ctgcggctac tgcacctgag tttgaacgcg acactttgtg gttaaatgga 5520

gaaccacaca gcatcgacaa tgaaagaact caaaattgtc tgcgcgacct acgccaatta 5580

agaaaggaaa tggaatcgaa ggacgcctca ttgcccacat tatctcaatg gaaactccac 5640

attgtctccg aaaataactt tcctacagca gctggtttag cttcctccgc tgctggcttt 5700

gctgcattgg tctctgcaat tgctaagtta [illegible] cacagtcaac ttcagaaata 5760

tctagaatag caagaaaggg gtctggttca [illegible] cgttgtttgg cggatacgtg 5820

gcctgggaaa tgggaaaagc tgaagatggt [illegible] tggcagtaca aatcgcagac 5880

agctctgact ggcctcagat gaaagcttgt gtcctagttg tcagcgatat taaaaaggat 5940

gtgagttcca ctcagggtat gcaattgacc gtggcaacct ccgaactatt taaagaaaga 6000

attgaacatg tcgtaccaaa gagatttgaa gtcatgcgta aagccattgt tgaaaaagat 6060

ttcgccacct ttgcaaagga aacaatgatg gattccaact ctttccatgc cacatgtttg 6120

gactctttcc ctccaatatt ctacatgaat gacacttcca agcgtatcat cagttggtgc 6180

cacaccatta atcagtttta cggagaaaca [illegible] acacgtttga tgcaggtcca 6240

aatgctgtgt tgtactactt agctgaaaat [illegible] tctttgcatt tatctataaa 6300

ttgtttggct ctgttcctgg atgggacaag [illegible] ctgagcagct tgaggctttc 6360

aaccatcaat ttgaatcatc taactttact [illegible] tggatcttga gttgcaaaag 6420

gatgttgcca gagtgatttt aactcaagtc [illegible] cacaagaaac aaacgaatct 6480

ttgattgacg caaagactgg tctaccaaag [illegible] tttaactcga cgccggcgga 6540

ggcacatatg tctcagaacg tttacattgt atcgactgcc agaaccccaa ttggttcatt 6600

ccagggttct ctatcctcca agacagcagt ggaattgggt gctgttgctt taaaaggcgc 6660

cttggctaag gttccagaat tggatgcatc caaggatttt gacgaaatta tttttggtaa 6720

cgttctttct gccaatttgg gccaagctcc ggccagacaa gttgctttgg ctgccggttt 6780

gagtaatcat atcgttgcaa gcacagttaa caaggtctgt gcatccgcta tgaaggcaat 6840

cattttgggt gctcaatcca tcaaatgtgg taatgctgat gttgtcgtag ctggtggttg 6900

tgaatctatg actaacgcac catactacat gccagcagcc cgtgcgggtg ccaaatttgg 6960 

ccaaactgtt cttgttgatg gtgtcgaaag agatgggttg aacgatgcgt acgatggtct 7020 

agccatgggt gtacacgcag aaaagtgtgc ccgtgattgg gatattacta gagaacaaca 7080 

agacaatttt gccatcgaat cctaccaaaa atctcaaaaa tctcaaaagg aaggtaaatt 7140

cgacaatgaa attgtacctg ttaccattaa gggatttaga ggtaagcctg atactcaagt 7200 

cacgaaggac gaggaacctg ctagattaca cgttgaaaaa ttgagatctg caaggactgt 7260 

tttccaaaaa gaaaacggta ctgttactgc cgctaacgct tctccaatca acgatggtgc 7320 

tgcagccgtc atcttggttt ccgaaaaagt tttgaaggaa aagaatttga agcctttggc 7380 

tattatcaaa ggttggggtg aggccgctca tcaaccagct gattttacat gggctccatc 7440

tcttgcagtt ccaaaggctt tgaaacatgc tggcatcgaa gacatcaatt ctgttgatta 7500 

ctttgaattc aatgaagcct tttcggttgt cggtttggtg aacactaaga ttttgaagct 7560 

agacccatct aaggttaatg tatatggtgg tgctgttgct ctaggtcacc cattgggttg 7620 

ttctggtgct agagtggttg ttacactgct atccatctta cagcaagaag gaggtaagat 7680 

cggtgttgcc gccatttgta atggtggtgg tggtgcttcc tctattgtca ttgaaaagat 7740

atgaggatcc tctagatgcg caggaggcac atatggcgaa gaacgttggg attttggcta 7800 

tggatatcta tttccctccc acctgtgttc aacaggaagc tttggaagca catgatggag 7860 

caagtaaagg gaaatacact attggacttg gccaagattg tttagctttt tgcactgagc 7920 

ttgaagatgt tatctctatg agtttcaatg cggtgacatc actttttgag aagtataaga 7980

ttgaccctaa ccaaatcggg cgtcttgaag taggaagtga gactgttatt gacaaaagca 8040

agtccatcaa gaccttcttg atgcagctct ttgagaaatg tggaaacact gatgtcgaag 8100 

gtgttgactc gaccaatgct tgctatggtg gaactgcagc tttgttaaac tgtgtcaatt 8160 

gggttgagag taactcttgg gatggacgtt atggcctcgt catttgtact gacagcgcgg 8220 

tttatgcaga aggacccgca aggcccactg gaggagctgc agcgattgct atgttgatag 8280 

gacctgatgc tcctatcgtt ttcgaaagca aattgagagc aagccacatg gctcatgtct 8340 

atgactttta caagcccaat cttgctagcg agtacccggt tgttgatggt aagctttcac 8400 

agacttgcta cctcatggct cttgactcct gctataaaca tttatgcaac aagttcgaga 8460

agatcgaggg caaagagttc tccataaatg atgctgatta cattgttttc cattctccat 8520 

acaataaact tgtacagaaa agctttgctc gtctcttgta caacgacttc ttgagaaacg 8580

caagctccat tgacgaggct gccaaagaaa agttcacccc ttattcatct ttgacccttg 8640 

acgagagtta ccaaagccgt gatcttgaaa aggtgtcaca acaaatttcg aaaccgtttt 8700 

atgatgctaa agtgcaacca acgactttaa taccaaagga agtcggtaac atgtacactg 8760

cttctctcta cgctgcattt gcttccctca tccacaataa acacaatgat ttggcgggaa 8820

agcgggtggt tatgttctct tatggaagtg gctccaccgc aacaatgttc tcattacgcc 8880

tcaacgacaa taagcctcct ttcagcattt caaacattgc atctgtaatg gatgttggcg 8940

gtaaattgaa agctagacat gagtatgcac ctgagaagtt tgtggagaca atgaagctaa 9000

tggaacatag gtatggagca aaggactttg tgacaaccaa ggagggtatt atagatcttt 9060

tggcaccggg aacttattat ctgaaagagg ttgattcctt gtaccggaga ttctatggca 9120

agaaaggtga agatggatct gtagccaatg gacactgagg atccgtcgag cacgtggagg 9180

cacatatgca atgctgtgag atgcctgttg gatacattca gattcctgtt gggattgctg 9240

gtccattgtt gcttgatggt tatgagtact ctgttcctat ggctacaacc gaaggttgtt 9300

tggttgctag cactaacaga ggctgcaagg ctatgtttat ctctggtggc gccaccagta 9360

ccgttcttaa ggacggtatg acccgagcac ctgttgttcg gttcgcttcg gcgagacgag 9420

cttcggagct taagtttttc ttggagaatc cagagaactt tgatactttg gcagtagtct 9480

tcaacaggtc gagtagattt gcaagactgc aaagtgttaa atgcacaatc gcggggaaga 9540

atgcttatgt aaggttctgt tgtagtactg gtgatgctat ggggatgaat atggtttcta 9600

aaggtgtgca gaatgttctt gagtatctta ccgatgattt ccctgacatg gatgtgattg 9660

gaatctctgg taacttctgt tcggacaaga aacctgctgc tgtgaactgg attgagggac 9720

gtggtaaatc agttgtttgc gaggctgtaa tcagaggaga gatcgtgaac aaggtcttga 9780

aaacgagcgt ggctgcttta gtcgagctca acatgctcaa gaacctagct ggctctgctg 9840

ttgcaggctc tctaggtgga ttcaacgctc atgccagtaa catagtgtct gctgtattca 9900

tagctactgg ccaagatcca gctcaaaacg tggagagttc tcaatgcatc accatgatgg 9960

aagctattaa tgacggcaaa gatatccata tctcagtcac tatgccatct atcgaggtgg 10020

ggacagtggg aggaggaaca cagcttgcat ctcaatcagc gtgtttaaac ctgctcggag 10080

ttaaaggagc aagcacagag tcgccgggaa tgaacgcaag gaggctagcg acgatcgtag 10140

ccggagcagt tttagctgga gagttatctt taatgtcagc aattgcagct ggacagcttg 10200

tgagaagtca catgaaatac aatagatcca gccgagacat ctctggagca acgacaacga 10260

caacaacaac aacatgaccc gtaggaggca catatgagtt cccaacaaga gaaaaaggat 10320

tatgatgaag aacaattaag gttgatggaa gaagtttgta tcgttgtaga tgaaaatgat 10380

gtccctttaa gatatggaac gaaaaaggag tgtcatttga tggaaaatat aaataaaggt 10440

cttttgcata gagcattctc tatgttcatc tttgatgagc aaaatcgcct tttacttcag 10500

cagcgtgcag aagagaaaat tacatttcca tccttatgga cgaatacatg ttgctcccac 10560

ccattggatg ttgctggtga acgtggtaat actttacctg aagctgttga aggtgttaag 10620

aatgcagctc aacgcaagct gttccatgaa ttgggtattc aagccaagta tattcccaaa 10680

gacaaatttc agtttcttac acgaatccat taccttgctc ctagtactgg tgcttgggga 10740

gagcatgaaa ttgactacat tcttttcttc aaaggtaaag ttgagctgga tatcaatccc 10800 

aatgaagttc aagcctataa gtatgttact atggaagagt taaaagagat gttttccgat 10860

cctcaatatg gattcacacc atggttcaaa cttatttgtg agcattttat gtttaaatgg 10920 

tggcaggatg tagatcatgc gtcaaaattc caagatacct taattcatcg ttgctaagga 10980 

tcccccggga tccggccgat ctaaacaaac ccggaacaga ccgttgggaa gcgattcagt 11040

aattaaagct tcatgactcc tttttggttc ttaaagtccc tttgaggtat caactaataa 11100 

gaaagatatt agacaacccc ccttttttct ttttcacaaa taggaagttt cgaatccaat 11160 

ttggatatta aaaggattac cagatataac acaaaatctc tccacctatt ccttctagtc 11220 

gagcctctcg gtctgtcatt atacctcgag aagtagaaag aattacaatc cccattccac 11280 

ctaaaattcg cggaattcgt tgataattag aatagattcg tagaccaggt cgactgattc 11340

gttttaaatt taaaatattt ctatagggtc ttttcctatt ccttctatgt cgcagggtta 11400 

aaaccaaaaa atatttgttt ttttctcgat gttttctcac gttttcgata aaaccttctc 11460

gtaaaagtat ttgaacaata ttttcggtaa tattagtaga tgctattcga accacccttt 11520 

ttcgatccat atcagcattt cgtatagaag ttattatctc agcaatagtg tccctaccca 11580 

tgatgaacta aaattattgg ggcctccaaa tttgatataa tcaacgtgtt ttttacttat 11640 

tttttttttg aatatgatat gaattattaa agatatatgc gtgagacaca atctactaat 11700 

taatctattt ctttcaaata ccccactaga aacagatcac aatttcattt tataatacct 11760 

cgggagctaa tgaaactatt ttagtaaaat ttaattctct caattcccgg gcgattgcac 11820 

caaaaattcg agttcctttt gatttccttc cttcttgatc aataacaact gcagcattgt 11880 

catcatatcg tattatcatc ccgttgtcac gtttgagttc tttacaggtc cgcacaatta 11940

cagctctgac tacttctgat ctttctaggg gcatatttgg tacggcttct ttgatcacag 12000 

caacaataac gtcaccaata tgagcatatc gacgattgct agctcctatg attcgaatac 12060 

acatcaattc tcgagccccg ctgttatccg ctacatttaa atgggtctga ggttgaatca 12120 

tttttttaat ccgttctttg aatgcaaagg gcgaagaaaa aaaagaaata tttttgtcca 12180 

aaaaaaaaga aacatgcggt ttcgtttcat atctaagagc cctttccgca tttttttcta 12240

ttacattacg aaataatgaa ttgagttcgt ataggcattt tagatgctgc tagtgaaata 12300 

gcccttctgg ctatattttc tgttactcca cccatttcat aaagtattcg acccggttta 12360 

acaacagcta cccaatattc aggggatcca ctagttctag agcggccgcc accgcggtgg 12420 

agctccagct tttgttccct ttagtgaggg ttaatttcga gcttggcgta atcatggtca 12480 

tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga 12540 

agcataaagt gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg 12600

cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc 12660

caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac 12720 

tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata 12780 

cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa 12840

aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct 12900

gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa 12960

agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg 13020 

cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca 13080 

cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 13140 

ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg 13200 

gtaagacacg acttatcgcc actggcagca gccactggta acaggattag cagagcgagg 13260 

tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta cactagaagg 13320 

acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc 13380 

tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag 13440 

attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac 13500 

gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc 13560 

ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag 13620 

taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt 13680 

ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag 13740 

ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca 13800 

gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact 13860 

ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca 13920 

gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg 13980 

tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc 14040

atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg 14100 

gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca 14160

tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt 14220 

atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc 14280 

agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc 14340

ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca 14400

tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa 14460

aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat 14520

tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa 14580

aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgc 14623

<210> 75 

<211> 7252 

<212> DNA 

<213> Artificial Sequence 
<220> 

<221> misc_feature 

<222> ()..() 

<223> Plastid transformation vector pFH05 containing R. capsulatus DNA
e 



<400> 75 
gcacttttcg gggaaatgtg cgcggaaccc ctatttgttt atttttctaa atacattcaa 60

atatgtatcc gctcatgaga caataaccct gataaatgct tcaataatat tgaaaaagga 120

agagtatgag tattcaacat ttccgtgtcg cccttattcc cttttttgcg gcattttgcc 180

ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg 240

gtgcacgagt gggttacatc gaactggatc tcaacagcgg taagatcctt gagagttttc 300

gccccgaaga acgttttcca atgatgagca cttttaaagt tctgctatgt ggcgcggtat 360

tatcccgtat tgacgccggg caagagcaac tcggtcgccg catacactat tctcagaatg 420

acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag 480

aattatgcag tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa 540

cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc 600

gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag cgtgacacca 660

cgatgcctgt agcaatggca acaacgttgc gcaaactatt aactggcgaa ctacttactc 720

tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc 780

tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg 840

ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta 900

tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag 960

gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga 1020

ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc 1080

tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac cccgtagaaa 1140

agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa 1200

aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc 1260

cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt 1320

agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc 1380

tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac 1440

gatagttacc ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc acacagccca 1500

gcttggagcg aacgacctac accgaactga gatacctaca gcgtgagcta tgagaaagcg 1560

ccacgcttcc cgaagggaga aaggcggaca ggtatccggt aagcggcagg gtcggaacag 1620

gagagcgcac gagggagctt ccagggggaa acgcctggta tctttatagt cctgtcgggt 1680

ttcgccacct ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg cggagcctat 1740

ggaaaaacgc cagcaacgcg gcctttttac ggttcctggc cttttgctgg ccttttgctc 1800

acatgttctt tcctgcgtta tcccctgatt ctgtggataa ccgtattacc gcctttgagt 1860

gagctgatac cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg agcgaggaag 1920

cggaagagcg cccaatacgc aaaccgcctc tccccgcgcg ttggccgatt cattaatgca 1980

gctggcacga caggtttccc gactggaaag cgggcagtga gcgcaacgca attaatgtga 2040

gttagctcac tcattaggca ccccaggctt tacactttat gcttccggct cgtatgttgt 2100

gtggaattgt gagcggataa caatttcaca caggaaacag ctatgaccat gattacgcca 2160

agctcgaaat taaccctcac taaagggaac aaaagctgga gctccaccgc ggtggcggcc 2220

gctctagaac tagtggatct tcttggctgt tattcaaaag gtccaacaat gtatatatat 2280

tggacatttt gaggcaatta tagatcctgg aaggcaattc tgattggtca ataaaaatcg 2340

atttcaatgc tatttttttt ttgtttttta tgagtttagc caatttatca tgaaaggtaa 2400

aaggggataa aggaaccgtg tgttgattgt cctgtaaata taagttgtct tcctccatat 2460

gtaaaaaggg aataaataaa tcaattaaat ttcgggatgc ttcatgaagt gcttctttcg 2520

gagttaaact tccgtttgtc catatttcga gaaaaagtat ctcttgtttt tcattcccat 2580

tcccataaga atgaatacta tgattcgcgt ttcgaacagg catgaataca gcatctatag 2640

gataacttcc atcttgaaag ttatgtggcg tttttataag atatccacga tttctctcta 2700

tttgtaatcc aatacaaaaa tcaattggtt ccgttaaact ggctatatgt tgtgtattat 2760

caacgatttc tacataaggc ggcaagatga tatcttgggc agttacagat ccaggaccct 2820

tgacacaaat agatgcgtca gaagttccat atagattact tcttaatata atttctttca 2880

aattcattaa aatttcatgt accgattctt gaatgcccgt tatggtagaa tattcatgtg 2940

ggactttctc agattttaca cgtgtgatac atgttccttc tatttctcca agtaaagctc 3000

ttcgcatcgc aatgcctatt gtgtcggctt ggcctttcat aagtggagac agaataaagc 3060 

gtccataata aaggcgttta ctgtctgttc ttgattcaac acacttccac tgtagtgtcc 3120 

gagtagatac tgttactttc tctcgaacca tagtactatt atttgattag atcatcgaat 3180 

cttttatttc tcttgagatt tcttcaatgt tcagttctac acacgtcttt ttttcggagg 3240

tctacagcca ttatgtggca taggagttac atcccgtacg aaagttaata gtataccact 3300 

tcgacgaata gctcgtaatg ctgcatctct tccgagaccg ggacctttta tcatgacttc 3360 

tgctcgttgc ataccttgat ccactactgt acggatagcg tttgctgctg cggtttgagc 3420 

agcaaacggt gttcctcttc tcgtaccttt gaatccagaa gtaccggcgg aggaccaaga 3480 

aactactcga ccccgtacat ctgtaacagt gacaatggta ttattgaaac ttgcttgaac 3540 

atgaataact ccctttggta ttctacgtgc acccttacgt gaaccaatac gtccattcct 3600

acgcgaacta attttcggta tagcttttgc catattttat catctcgtaa atatgagtca 3660 

gagatatatg gatatatcca tttcatgtca aaacagattc tttatttgta catcggctct 3720 

tctggcaagt ctgattatcc ctgtctttgt ttatgtctcg ggttggaaca aattactata 3780 

attcgtcccc gcctacggat tagtcgacat ttttcacaaa ttttacgaac ggaagctctt 3840 

attttcatat ttctcattcc ttaccttaat tctgaatcta tttcttggaa gaaaataagt 3900

ttcttgaaat ttttcatctc gaattgtatt cccacgaaag gaatggtgaa gttgaaaaac 3960 

gaatccttca aatctttgtt gtggagtcga taaattatac gccctttggt tgaatcataa 4020 

ggacttactt caattttgac tctatctcct ggcagtatcc gtataaaact atgccggatc 4080

tttcctgaaa cataatttat aatcagatcc aggaggacca tatgatcgcc gaagcggata 4140

tggaggtctg ccgggagctg atccgcaccg gcagctactc cttccatgcg gcgtccagag 4200 

ttctgccggc gcgggtccgt gaccccgcgc tggcgcttta cgccttttgc cgcgtcgccg 4260 

atgacgaagt cgacgaggtt ggcgcgccgc gcgacaaggc tgcggcggtt ttgaaacttg 4320 

gcgaccggct ggaggacatc tatgccggtc gtccgcgcaa tgcgccctcg gatcgggctt 4380 

tcgcggcggt ggtcgaggaa ttcgagatgc cgcgcgaatt gcccgaggcg ctgctggagg 4440 

gcttcgcctg ggatgccgag gggcggtggt atcacacgct ttcggacgtg caggcctatt 4500 

cggcgcgggt ggcggccgcc gtcggcgcga tgatgtgcgt gctgatgcgg gtgcgcaacc 4560 

ccgatgcgct ggcgcgggcc tgcgatctcg gtcttgccat gcagatgtcg aacatcgccc 4620

gcgacgtggg cgaggatgcc cgggcggggc ggcttttcct gccgaccgac tggatggtcg 4680

aggaggggat cgatccgcag gcgttcctgg ccgatccgca gcccaccaag ggcatccgcc 4740

gggtcaccga gcggttgctg aaccgcgccg accggcttta ctggcgggcg gcgacggggg 4800 

tgcggctttt gccctttgac tgccgaccgg ggatcatggc cgcgggcaag atctatgccg 4860 
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cgatcggggc cgaggtggcg aaggcgaaat acgacaacat cacccggcgt gcccacacga 4920 

ccaagggccg caagctgtgg ctggtggcga attccgcgat gtcggcgacg gcgacctcga 4 980 

tgctgccgct ctcgccgcgg gtgcatgcca agcccgagcc cgaagtggcg catctggtcg 5040 

atgccgccgc gcatcgcaac ctgcatcccg aacggtccga ggtgctgatc tcggcgctga 5100 

tggcgctgaa ggcgcgcgac cgcggcctgg cgatggattg aggatctaaa caaacccgga 5160 

acagaccgtt gggaagcgat tcagtaatta aagcttcatg actccttttt ggttcttaaa 5220 

gtccctttga ggtatcaact aataagaaag atattagaca accccccttt tttctttttc 5280 

acaaatagga agtttcgaat ccaatttgga tattaaaagg attaccagat ataacacaaa 5340 

atctctccac ctattccttc tagtcgagcc tctcggtctg tcattatacc tcgagaagta 5400 

gaaagaatta caatccccat tccacctaaa attcgcggaa ttcgttgata attagaatag 5460

attcgtagac caggtcgact gattcgtttt aaatttaaaa tatttctata gggtcttttc 5520 

ctattccttc tatgtcgcag ggttaaaacc aaaaaatatt tgtttttttc tcgatgtttt 5580 

ctcacgtttt cgataaaacc ttctcgtaaa agtatttgaa caatattttc ggtaatatta 5640 

gtagatgcta ttcgaaccac cctttttcga tccatatcag catttcgtat agaagttatt 5700 

atctcagcaa tagtgtccct acccatgatg aactaaaatt attggggcct ccaaatttga 5760

tataatcaac gtgtttttta cttatttttt ttttgaatat gatatgaatt attaaagata 5820 

tatgcgtgag acacaatcta ctaattaatc tatttctttc aaatacccca ctagaaacag 5880 

atcacaattt cattttataa tacctcggga gctaatgaaa ctattttagt aaaatttaat 5940

tctctcaatt cccgggcgat tgcaccaaaa attcgagttc cttttgattt ccttccttct 6000 

tgatcaataa caactgcagc attgtcatca tatcgtatta tcatcccgtt gtcacgtttg 6060 

agttctttac aggtccgcac aattacagct ctgactactt ctgatctttc taggggcata 6120 

tttggtacgg cttctttgat cacagcaaca ataacgtcac caatatgagc atatcgacga 6180 

ttgctagctc ctatgattcg aatacacatc aattctcgag ccccgctgtt atccgctaca 6240 

tttaaatggg tctgaggttg aatcattttt ttaatccgtt ctttgaatgc aaagggcgaa 6300 

gaaaaaaaag aaatattttt gtccaaaaaa aaagaaacat gcggtttcgt ttcatatcta 6360 

agagcccttt ccgcattttt ttctattaca ttacgaaata atgaattgag ttcgtatagg 6420 

cattttagat gctgctagtg aaatagccct tctggctata ttttctgtta ctccacccat 6480 

ttcataaagt attcgacccg gtttaacaac agctacccaa tattcagggg atcccccggg 6540 

ctgcaggaat tcgatatcaa gcttatcgat accgtcgacc tcgagggggg gcccggtacc 6600 

caattcgccc tatagtgagt cgtattacaa ttcactggcc gtcgttttac aacgtcgtga 6660 

ctgggaaaac cctggcgtta cccaacttaa tcgccttgca gcacatcccc ctttcgccag 6720 

ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc caacagttgc gcagcctgaa 6780

tggcgaatgg gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg tggttacgcg 6840

cagcgtgacc gctacacttg ccagcgccct agcgcccgct cctttcgctt tcttcccttc 6900 

ctttctcgcc acgttcgccg gctttccccg tcaagctcta aatcgggggc tccctttagg 6960 

gttccgattt agtgctttac ggcacctcga ccccaaaaaa cttgattagg gtgatggttc 7020 

acgtagtggg ccatcgccct gatagacggt ttttcgccct ttgacgttgg agtccacgtt 7080 

ctttaatagt ggactcttgt tccaaactgg aacaacactc aaccctatct cggtctattc 7140 

ttttgattta taagggattt tgccgatttc ggcctattgg ttaaaaaatg agctgattta 7200 

acaaaaattt aacgcgaatt ttaacaaaat attaacgctt acaatttagg tg 7252 

<210> 76 
<211> 14623 
<212> DNA 

<213> Artificial Sequence 

<220>
<221> misc_feature

<222> ()..()

<223> Plastid transformation vector pFH06, containing Operon E, contain 
i 



<400> 76 
cacctaaatt gtaagcgtta atattttgtt aaaattcgcg ttaaattttt gttaaatcag 60

ctcatttttt aaccaatagg ccgaaatcgg caaaatccct tataaatcaa aagaatagac 120

cgagataggg ttgagtgttg ttccagtttg gaacaagagt ccactattaa agaacgtgga 180

ctccaacgtc aaagggcgaa aaaccgtcta tcagggcgat ggcccactac gtgaaccatc 240

accctaatca agttttttgg ggtcgaggtg ccgtaaagca ctaaatcgga accctaaagg 300

gagcccccga tttagagctt gacggggaaa gccggcgaac gtggcgagaa aggaagggaa 360

gaaagcgaaa ggagcgggcg ctagggcgct ggcaagtgta gcggtcacgc tgcgcgtaac 420

caccacaccc gccgcgctta atgcgccgct acagggcgcg tcccattcgc cattcaggct 480

gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc agctggcgaa 540

agggggatgt gctgcaaggc gattaagttg ggtaacgcca gggttttccc agtcacgacg 600

ttgtaaaacg acggccagtg aattgtaata cgactcacta tagggcgaat tgggtaccgg 660
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ft r> i~\ 4— 

g ccccccclc 


gaggucgacg 


ft \ • *a 4- r~\ ri n 4- ^ ri 4- 4- /-y 4- «-i 4- <^r "zs 4- 4- ^ 4— c*x 
gtaLCyduaa get igatatC yaatLCOLyu 


a y i_y y y y y 


790 


r~\ 4- /-^•+-'4- y^< 4— s"r/*Y 

accT-ucxtyg 


s-* 4- /-t 4- 4- aff/is 


a a a rrrr f /ti a a /">as ai +■ rr4- ai 4~ a +■ a'Ha'Hrrrfafa 
adayyCCCao CaaLyLaCaL aUaL Ly y dv_d 


4-4-4-4-<~r-3/-f/-*' ( r-ii3 

lll uydyyuo 


7 o n 


attatagatc 


ctggaaggca 


at uctya utg gtcaauaaaa aicgaLuucd 


4- rrr-4- 3 4-4-4-4- 


ft Zl fi 


4- j_ 4_ 4_ 4_ ^- 4_ 4. 
U L LL LL yuX. U 


4-4-4- -i 4- /-» -i rx\- 4- 


LayCCadLUL aLCaL yaaay y Laaaay y y y 


dLdddyyddL 




cgtgtgttga 


ttgtcctgta 


aatataaguL gncLLCCi.cc aLaLgraaaa 


ayyyaaLdaa 


you 


taaatcaatt 


aaatttcggg 


atgcttcatg aagtgcttct ttcggagtta 


"^i 4— 4— /-^ 4- 4— 

aacLLCcgL L 




tgtccatatt 


tcgagaaaaa 


g ta lClc LL.g l.l ll l ca ll c ccaLtcccaL 


ddydatyddL 


1 nan 


actatgattc 


gcgtttcgaa 


caggcatgaa tacagcatct ataggataac 


ttccatcttg 


1 1 /in 


aaagttatgt 


ggcgttttta 


taagatatcc acgatttctc tctatttgta 


at ccaataca 


1 nnn 
1<:UU 


aaaatcaatt 


ggttccgtta 


aactggctat atgttgtgta ttatcaacga 


4- 4- 4* 4- —% —\ 4— -*i 


1 ocn 


aggcggcaag 


atgatatctt 


gggcagttac agatccagga cccttgacac 


aaatagatgc 




gtcagaagtt 


ccatatagat 


tacttcttaa tataatttct ttcaaatLca 


4_4___ __4_4_4__, 

ttaaaat llc 




atgtaccgat 


tcttgaatgc 


ccgttatggt agaatattca tgtgggactt 


tctcagattt 


t >i a n 


tacacgtgtg 


atacatgttc 


cttctatttc tccaagtaaa gctcttcgca 


tcgcaatgcc 


i cnn 
loUU 


tattgtgtcg 


gcttggcctt 


tcataagtgg agacagaata aagcgtccat 


aataaaggcg 


IodU 


tttactgtct ,gt\tcttgatt 


.caacacactt ccactg.tagt. gtccgagtag 


aLacLgtiTiac - 


n con 
. 1 0 Z U 


tttctctcga 


accatagtac 


tattatttga ttagatcatc gaatctttta 


tttctcttga 


IDOU 


gatttcttca 


atgttcagtt 


ctacacacgt ctttttttcg gaggtctaca 


gcca l t.a tig l 


1 1 ah 

JL / 4 U 


ggcataggag ttacatcccg 


tacgaaagtt aatagtatac cacttcgacg 


aatagct cgt 


1 q nn 
10 uu 


aatgctgcat ctcttccgag 


accgggacct tttatcatga cttctgctcg 


ttgcatacct 


1 o(;n 


tgatccacta 


ctgtacggat 


agcgtttgct gctgcggttt gagcagcaaa. 

j 


cggLgiz lccl 


1 Qon* 

J. i^^: U 


cttctcgtac ctttgaatcc 


agaagtaccg gcggaggacc aagaaactac 


tcyaccccy l 




acatctgtaa 


cagtgacaat 


gguatLauLy aaaccuyCLL yaa.LaLyd.ciL 


aa^f rrrf^i" 




ggtattctac gtgcaccctt 


acgLgaacca auacyLCCai. LccLacycya 


dLLddLLLLL 


X W U 


ggtatagctt ttgccatatt 


LLdiCdiCT-L yi.aaaLdi.gd y LLagaydid 


L-d. Ly yd La Lq 


?i fin 


tccatttcat 


gtcaaaacag 


— ^4-4- i^i4-4— 4- •— ^4-~f— 4- r*r 4- ^n^^n n /*<» /i 4- ■ 4~ 4~ /*«4~ <^r/-r/-^ 

attCLttaut ty LacaLcyy cuctLcugyc 


aarr4-/-i4-rTr3i"4- 
ddy LLLydLL 




atccctgtct 


ttgtttatgt 


ctcgggttgg aacaaattac tataattcgt 


ccccgcctac 


ZZo U 


ggattagtcg 


acatttttca 


caaattttac gaacggaagc tcttattttc 


atatttctca 


2340 


ttccttacct taattctgaa 


tctatttctt ggaagaaaat aagtttcttg 


aaatttttca 


2400 


tctcgaattg tattcccacg 


aaaggaatgg tgaagttgaa aaacgaatcc 


ttcaaatctt 


2460. 


tgttgtggag tcgataaatt 


atacgccctt tggttgaatc ataaggactt 


acttcaattt 


2520 


tgactctatc tcctggcagt 


atccgtataa aactatgccg gatctttcct 


gaaacataat 


2580


ttataatcag


atccrcrccaca Qoaacraattc atatairaaa attcraaaarr* 


U. LLCiy Ly L>LL/ 




cacrqcraaaac 


gttactagct ggtggatatt tagttttaga tacaaaatat 


oaaarai"l" I - rr 

y uay >^ci ll. LU 


9700 


tagtcggatt 


atcggcaaga atgcatgctg tagcccatcc ttacggttca 


1" +* eica a nrrrri" 
*— uy Laayyy L 


/ t> LI 


ctgataagtt 


tgaagtgcgt gtgaaaagta aacaatttaa agatggggag 




O ^ \J 


atataagtcc 


taaaagtggc ttcattcctg tttcgatagg cggatctaag 


aaccctttca 


2880 


ttgaaaaagt 


tatcgctaac gtatttagct actttaaacc taacatggac 


oactact crca 1 


2940 


atagaaactt 


gttcgttatt gatattttct ctgatgatgc ctaccattct 


caggaggata 


3000 


gcgt taccga 


acatcataac aacaaaaaat taaattttca ttcocacacra 


a*t"l" rra a era an 
ex l. uyctayciciy 




ttcccaaaac 


aQQactocrac tcctcaacaa atttaatcac aattttaart* 


a P^ rrr*"t" t" t" rrrr 
a. a.y l, l. v_ Lyy 




cctccttttt 


tcrtatCQCrac ctcrcraaaata at~crtacrar , aa at"a1~arracf?i;3 


y L. lol LLaLa 




a t t tagcaca 


acfttcrctcat totcaaactc a a a at a a a at t nnaanrrfncr 

3 i»y \* ovju o k.yu^uuyi».n-« ^33 ^^^ LyUCluy^yuu 


uuuya.i_yL.cty 


O/- ft u 




a +" a "trrna t* r*t" a1~r , 3ria"h?i't~3 rr^^i na "h +" rrr" apppnraf f a 
aLQLyyaL^L a t^ay a La La Ljaaya L LLuL. aL-CuyL-a L La 


aiCUCUaaLt 


JJUU 


tcrccacratat 


tcrcraacrt~cjct ar*i~i~apn'ffr*a ^rt*aaa^ , 1"^r^T^ , ncai~1~T- frcfHt" 
y _7 q vL uaL>^yLa y LuQu L- Ly y L ulq l l luu l l 


na'fna artaarr 
yaUyaciycldy 


jj DU 


actggaat at 


taccrattaaa acTtaacpai" t" "t"appt"hprrrTrT af'ha.ap'h'hl-a 

uu^^a^ uaaa ay LauLLaL L LuLiLL LUyy^ uLLadLLLLcl 






atattaaaaa 


^y y l j auay uaaaau Ly y LLLayaa Ljy Laaaaad L 


uyy LaLyaLT. 




cgcatatgcc 


acraaacrcttcT aaaatatata caaaantccra tcal-nr^aaat* 


LLLaya U U La 


JJ^U 


tggatggact 


atctaaacta aatcactfac arcrafrartra ^-nanrTa+*■h^^r , 


ay L,ya Luay a 


JDUU 


tatttgagtc 


tcttaacraaa aataactata cctatcaaaa atatcctoaa 


atrararraaft 


JUUU 


ttaaaaatac 


acrttcrccaca attaaacatt cctttacraaa aataactaaa 


fia a t" P+" rrrri" rr 
y aa ll Lyy Ly 




ccCTatatcaa 


aoouuooy La oaaaLLayLL La L Lyya Lya L LyLLayaLL- 


. U Ladaayydy 




ttcttactta 


^ l. uau. lqi^l l yy Ly^Lyy Ly y LLaLyctL/yU L»a LLLJLayLy 


aLLaL Laayt 


^ a a n 

JO 4 U 


aagatgttga 


tcttacrCTCict caaaccnrf a ai*naraaaan a+"f"i"1" ci - aarr 

L.\^v.uuyyy^>u waa ti v^^y v» ca uLyauaaQCtu Q LL L LLLQaU 


y U LLaa uyyL 


J-?UU 


tgga tgtaac 


tcaaactaac taacratatta aaaaaaaaaa acrafrrrrrraa ' 

w^uv y y y 3 3 i— ^ yy wua y ciuua ciyciL. ^ v«-y y a q 


GL L La LL L Ly 


Jt?OU 


ataaactgca 


QQaaaacrttt taatatcatt accattctta acttctacac 


pfyrrfraaarrrff - ' 


4020 


tattattttt 


CTataaacact ctactatata caacaaarrf rrr , r , rr't-r'rTr , 'l-fT 


l Lay Ly Ly LL 


4 no n 

y uo u 


tacattaaaa 


acctacctac taataaacaa atca*t"ct"rrpa rr , arr?it-ar , -t-?i 


u l y da. l Ly ya 


4 1 4 n 


cttcccggac 


attagcttta atcataagtg gtccatcaat gatttcaatg 


LLa LLaLuya 


4200 


ggatcaagta 


aactcccaaa aattggccaa ggctcaacaa gccaccgatg 


gcttgtctca 


4 260 


ggaactcgtt 


agtcttttgg atccgttgtt agctcaacta tccgaatcct 


tccactacca 


4320 


tgcagcgttt 


tgtttcctgt atatgtttgt ttgcctatgc ccccatgcca 


agaatattaa 


4380 


gttttcttta 


aagtctactt tacccatcgg tgctgggttg ggctcaagcg 


cctctatttc 


4440 


tgtatcactg 


gccttagcta tggcctactt gggggggtta ataggatcta 


atgacttgga 


4500


aaagctgtca


gaaaacgata 


agcatatagt 



gaatcaatgg 


gccu tea tag 


gugaaaagrg 


a ^ £n 
4 o bu 


tattcacggt 


accccttcag 


gaatagataa 


cgctgtggcc 


ac u ta tggca 


augcccugcL 


^ DZ u 


atttgaaaaa 


gactcacata 


atggaacaat 


aaacacaaac 


a a T, r n n a a g u 


4" 4* 4* "T* ^ 4^ ^ 

t cttaga uga 


41 DO U 


tttcccagcc 


attccaatga 


tcctaaccta 


tactagaatt 


ccaaggt ct a 


caaaagatct 


AH AC\ 


tgttgctcgc 


gttcgtgtgt 


tggtcaccga 


gaaatttcct 


gaagttatga 


agecaatt ct 


4 o UU 


agatgccatg 


ggtgaatgtg 


ccctacaagg 


cttagagatc 


atgactaagt 


taagtaaatg 


a q tzr\ 

4 0 OU 


taaaggcacc 


gatgacgagg 


ctgtagaaac 


taataatgaa 


ctgtatgaac 


aactattgga 


4 yzu 


attgataaga 


ataaatcatg 


gactgcttgt 


ctcaatcggt 


gt met catc 


ctggattaga 


4 you 


acttattaaa 


aatctgagcg 


atgatttgag 


aattggctcc 


acaaaactta 


ccggtgctgg 


c r\ a r\ 


tggcggcggt 


tgctctttga 


_ 4_ 4- 4_ — J_ i_ _ _ 

ctttgttacg 


aagagacatt 


act caa gage 


aaa t tgacag 


D1UU 


cttcaaaaag 


aaattgcaag 


augatxtxag 


ttacgagaca 


t t t gaaacag 


act ugggegg 


fin 


gactggctgc 


tgtttgttaa 


gcgcaaaaaa 


tttgaataaa 


gatcttaaaa 


tcaaatccct 


coon 


agtattccaa 


ttatttgaaa 


ataaaactac 


cacaaagcaa 


caaattgacg 


atctattatt. 


c o o d 


gccaggaaac 


acgaatttac 


catggacttc 


agacgaggag 


ttttaatgac 


tgtatatact 




gctagtgtaa 


ctgctccggt 


aaatattgct 


actcttaagt 


attgggggaa 


aagggacacg 


04UU 


aagttgaatc 


tgcccaccaa 


ttcgtccata 


tcagtgactt 


tategcaaga 


tgacctcaga 


D4J>y 


acgttgacct 


ctgcggctac 


tgcacctgag 


tttgaacgcg 


acactttgtg 


gttaaatgga 


c: t; o n 


gaaccacaca 


gcatcgacaa 


tgaaagaact 


caaaattgtc 


tgcgcgacct 


aegecaatta 


CCQA 

OOoU 


agaaaggaaa 


tggaatcgaa 


ggacgcctca 


ttgcccacat 


tatctcaatg 


gaaactccac 


O D 4 U 


attgtctccg 


aaaataactt 


tcctacagca 


gctggtttag 


cttcct cege 


^ r^r /"^ 4* rr « /"i 4- 4- +* 


3 / UU 


gctgcattgg 


tctctgcaat 


tgctaagtta 


taccaattac 


cacayuCddC 


LLCayaaata 


^7 fin 

O / OU 


tctagaatag 


caagaaaggg 


gtctggttca 


gcttgtagat 


z-^ j^-r 4~ 4* 4 4* ^™ f*w 

cgutgnu ugg 


cggatacgng 


JOiU 


gcctgggaaa 


tgggaaaagc 


tgaagatggt 


catgattcca 


tggcagtaca 


aategcagae 


CL Q Q n 


agctctgact 


ggcctcagat 


gaaagcttgt 


gtcctagttg 


teagegatat 


taaaaaggai. 


o y fi u 


gtgagttcca 


ctcagggtat 


gcaattgacc 


gtggcaacct 


ccgaactatt 


4" ^ o a rr *3 ^ ^3 

Lddagdddya 


OU UU 


attgaacatg 


t cgtaccaaa 


gagatttgaa 


gtcatgcgt a 


aagccarugr 


L ydaaaayd 1_ 


finfin 

O U OU 


ttcgccacct 


ttgcaaagga 


aacaatgatg 


gattccaact 


jr-m. Xm £~ , +*\ +- A 

ctttccdLgc 


cacaugu uug 


fil 90 


gactctttcc 


ctccaatatt 


ctacatgaat 


gacactt cca 


agegtatcat 


cagt t ggtgc 


fii pn 

DIOU 


cacaccatta 


atcagtttta 


cggagaaaca 


atcgttgcat 


acacgtttga 


tgcaggtcca 


6240 


aatgctgtgt 


tgtactactt 


agctgaaaat 


gagtcgaaac 


tetttgeatt 


'tatctataaa 


6300 


ttgtttggct 


ctgttcctgg 


atgggacaag 


aaatttacta 


ctgagcagct 


tgaggctttc 


6360 


aaccatcaat 


ttgaatcatc 


taactttact 


gcacgtgaat 


tggatcttga 


gttgcaaaag 


6420 



gatgttgcca gagtgatttt aactcaagtc ggttcaggcc cacaagaaac aaacgaatct 6480

ttgattgacg caaagactgg tctaccaaag gaagaggagt tttaactcga cgccggcgga 6540

ggcacatatg tctcagaacg tttacattgt atcgactgcc agaaccccaa ttggttcatt 6600 

ccagggttct ctatcctcca agacagcagt ggaattgggt gctgttgctt taaaaggcgc 6660 

cttggctaag gttccagaat tggatgcatc caaggatttt gacgaaatta tttttggtaa 6720 

cgttctttct gccaatttgg gccaagctcc ggccagacaa gttgctttgg ctgccggttt 6780 

gagtaatcat atcgttgcaa gcacagttaa caaggtctgt gcatccgcta tgaaggcaat 6840 

cattttgggt gctcaatcca tcaaatgtgg taatgctgat gttgtcgtag ctggtggttg 6900 

tgaatctatg actaacgcac catactacat gccagcagcc cgtgcgggtg ccaaatttgg 6960 

ccaaactgtt cttgttgatg gtgtcgaaag agatgggttg aacgatgcgt acgatggtct 7020

agccatgggt gtacacgcag aaaagtgtgc ccgtgattgg gatattacta gagaacaaca 7080 

agacaatttt gccatcgaat cctaccaaaa atctcaaaaa tctcaaaagg aaggtaaatt 7140 

cgacaatgaa attgtacctg ttaccattaa gggatttaga ggtaagcctg atactcaagt 7200 

cacgaaggac gaggaacctg ctagattaca cgttgaaaaa ttgagatctg caaggactgt 7260 

tttccaaaaa gaaaacggta ctgttactgc cgctaacgct tctccaatca acgatggtgc 7320 

tgcagccgtc atcttggttt ccgaaaaagt tttgaaggaa aagaatttga agcctttggc 7380

tattatcaaa ggttggggtg aggccgctca tcaaccagct gattttacat gggctccatc 7440 

tcttgcagtt ccaaaggctt tgaaacatgc tggcatcgaa gacatcaatt ctgttgatta 7500 

ctttgaattc aatgaagcct tttcggttgt cggtttggtg aacactaaga ttttgaagct 7560

agacccatct aaggttaatg tatatggtgg tgctgttgct ctaggtcacc cattgggttg 7620

ttctggtgct agagtggttg ttacactgct atccatctta cagcaagaag gaggtaagat 7680

cggtgttgcc gccatttgta atggtggtgg tggtgcttcc tctattgtca ttgaaaagat 7740 

atgaggatcc tctagatgcg caggaggcac atatggcgaa gaacgttggg attttggcta 7800

tggatatcta tttccctccc acctgtgttc aacaggaagc tttggaagca catgatggag 7860

caagtaaagg gaaatacact attggacttg gccaagattg tttagctttt tgcactgagc 7920

ttgaagatgt tatctctatg agtttcaatg cggtgacatc actttttgag aagtataaga 7980 

ttgaccctaa ccaaatcggg cgtcttgaag taggaagtga gactgttatt gacaaaagca 8040 

agtccatcaa gaccttcttg atgcagctct ttgagaaatg tggaaacact gatgtcgaag 8100 

gtgttgactc gaccaatgct tgctatggtg gaactgcagc tttgttaaac tgtgtcaatt 8160 

gggttgagag taactcttgg gatggacgtt atggcctcgt catttgtact gacagcgcgg 8220

tttatgcaga aggacccgca aggcccactg gaggagctgc agcgattgct atgttgatag 8280

gacctgatgc tcctatcgtt ttcgaaagca aattgagagc aagccacatg gctcatgtct 8340

atgactttta caagcccaat cttgctagcg agtacccggt tgttgatggt aagctttcac 8400

agacttgcta cctcatggct cttgactcct gctataaaca tttatgcaac aagttcgaga 8460

agatcgaggg caaagagttc tccataaatg atgctgatta cattgttttc cattctccat 8520 

acaataaact tgtacagaaa agctttgctc gtctcttgta caacgacttc ttgagaaacg 8580 

caagctccat tgacgaggct gccaaagaaa agttcacccc ttattcatct ttgacccttg 8640

acgagagtta ccaaagccgt gatcttgaaa aggtgtcaca acaaatttcg aaaccgtttt 8700 

atgatgctaa agtgcaacca acgactttaa taccaaagga agtcggtaac atgtacactg 8760 

cttctctcta cgctgcattt gcttccctca tccacaataa acacaatgat ttggcgggaa 8820 

agcgggtggt tatgttctct tatggaagtg gctccaccgc aacaatgttc tcattacgcc 8880

tcaacgacaa taagcctcct ttcagcattt caaacattgc atctgtaatg gatgttggcg 8940 

gtaaattgaa agctagacat gagtatgcac ctgagaagtt tgtggagaca atgaagctaa 9000 

tggaacatag gtatggagca aaggactttg tgacaaccaa ggagggtatt atagatcttt 9060

tggcaccggg aacttattat ctgaaagagg ttgattcctt gtaccggaga ttctatggca 9120 

agaaaggtga agatggatct gtagccaatg gacactgagg atccgtcgag cacgtggagg 9180 

cacatatgca atgctgtgag atgcctgttg gatacattca gattcctgtt gggattgctg 9240

gtccattgtt gcttgatggt tatgagtact ctgttcctat ggctacaacc gaaggttgtt 9300

tggttgctag cactaacaga ggctgcaagg ctatgtttat ctctggtggc gccaccagta 9360 

ccgttcttaa ggacggtatg acccgagcac ctgttgttcg gttcgcttcg gcgagacgag 9420 

cttcggagct taagtttttc ttggagaatc cagagaactt tgatactttg gcagtagtct 9480 

tcaacaggtc gagtagattt gcaagactgc aaagtgttaa atgcacaatc gcggggaaga 9540 

atgcttatgt aaggttctgt tgtagtactg gtgatgctat ggggatgaat atggtttcta 9600 

aaggtgtgca gaatgttctt gagtatctta ccgatgattt ccctgacatg gatgtgattg 9660 

gaatctctgg taacttctgt tcggacaaga aacctgctgc tgtgaactgg attgagggac 9720 

gtggtaaatc agttgtttgc gaggctgtaa tcagaggaga gatcgtgaac aaggtcttga 9780

aaacgagcgt ggctgcttta gtcgagctca acatgctcaa gaacctagct ggctctgctg 9840

ttgcaggctc tctaggtgga ttcaacgctc atgccagtaa catagtgtct gctgtattca 9900 

tagctactgg ccaagatcca gctcaaaacg tggagagttc tcaatgcatc accatgatgg 9960 

aagctattaa tgacggcaaa gatatccata tctcagtcac tatgccatct atcgaggtgg 10020 

ggacagtggg aggaggaaca cagcttgcat ctcaatcagc gtgtttaaac ctgctcggag 10080 

ttaaaggagc aagcacagag tcgccgggaa tgaacgcaag gaggctagcg acgatcgtag 10140

ccggagcagt tttagctgga gagttatctt taatgtcagc aattgcagct ggacagcttg 10200 

tgagaagtca catgaaatac aatagatcca gccgagacat ctctggagca acgacaacga 10260

caacaacaac aacatgaccc gtaggaggca catatgagtt cccaacaaga gaaaaaggat 10320
tatgatgaag aacaattaag gttgatggaa gaagtttgta tcgttgtaga tgaaaatgat 10380 
gtccctttaa gatatggaac gaaaaaggag tgtcatttga tggaaaatat aaataaaggt 10440 
cttttgcata gagcattctc tatgttcatc tttgatgagc aaaatcgcct tttacttcag 10500 

cagcgtgcag aagagaaaat tacatttcca tccttatgga cgaatacatg ttgctcccac 10560 

ccattggatg ttgctggtga acgtggtaat actttacctg aagctgttga aggtgttaag 10620 

aatgcagctc aacgcaagct gttccatgaa ttgggtattc aagccaagta tattcccaaa 10680 

gacaaatttc agtttcttac acgaatccat taccttgctc ctagtactgg tgcttgggga 10740 

gagcatgaaa ttgactacat tcttttcttc aaaggtaaag ttgagctgga tatcaatccc 10800 

aatgaagttc aagcctataa gtatgttact atggaagagt taaaagagat gttttccgat 10860

cctcaatatg gattcacacc atggttcaaa cttatttgtg agcattttat gtttaaatgg 10920 

tggcaggatg tagatcatgc gtcaaaattc caagatacct taattcatcg ttgctaagga 10980 

tcccccggga tccggccgat ctaaacaaac ccggaacaga ccgttgggaa gcgattcagt 11040 

aattaaagct tcatgactcc tttttggttc ttaaagtccc tttgaggtat caactaataa 11100 

gaaagatatt agacaacccc ccttttttct ttttcacaaa taggaagttt cgaatccaat 11160 

ttggatatta aaaggattac cagatataac acaaaatctc tccacctatt ccttctagtc 11220 

gagcctctcg gtctgtcatt atacctcgag aagtagaaag aattacaatc cccattccac 11280

ctaaaattcg cggaattcgt tgataattag aatagattcg tagaccaggt cgactgattc 11340

gttttaaatt taaaatattt ctatagggtc ttttcctatt ccttctatgt cgcagggtta 11400

aaaccaaaaa atatttgttt ttttctcgat gttttctcac gttttcgata aaaccttctc 11460

gtaaaagtat ttgaacaata ttttcggtaa tattagtaga tgctattcga accacccttt 11520 

ttcgatccat atcagcattt cgtatagaag ttattatctc agcaatagtg tccctaccca 11580 

tgatgaacta aaattattgg ggcctccaaa tttgatataa tcaacgtgtt ttttacttat 11640 

tttttttttg aatatgatat gaattattaa agatatatgc gtgagacaca atctactaat 11700 

taatctattt ctttcaaata ccccactaga aacagatcac aatttcattt tataatacct 11760

cgggagctaa tgaaactatt ttagtaaaat ttaattctct caattcccgg gcgattgcac 11820 

caaaaattcg agttcctttt gatttccttc cttcttgatc aataacaact gcagcattgt 11880 

catcatatcg tattatcatc ccgttgtcac gtttgagttc tttacaggtc cgcacaatta 11940

cagctctgac tacttctgat ctttctaggg gcatatttgg tacggcttct ttgatcacag 12000 

caacaataac gtcaccaata tgagcatatc gacgattgct agctcctatg attcgaatac 12060 

acatcaattc tcgagccccg ctgttatccg ctacatttaa atgggtctga ggttgaatca 12120 

tttttttaat ccgttctttg aatgcaaagg gcgaagaaaa aaaagaaata tttttgtcca 12180

gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca 14160

tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt 14220 

atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc 14280 

agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc 14340 

ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca 14400 

tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa 14460

aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat 14520 

tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa 14580 

aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgc 14623